{
  "cells": [
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "UMqBL77hMXP2"
      },
      "source": [
        "# Text Classification pipeline using traditional Machine Learning methods\n",
        "\n",
        "Authors:  \n",
        " - [Lior Gazit](https://www.linkedin.com/in/liorgazit).  \n",
        " - [Meysam Ghaffari](https://www.linkedin.com/in/meysam-ghaffari-ph-d-a2553088/).  \n",
        "\n",
        "This notebook is taught and reviewed in our book:  \n",
        "**[Mastering NLP from Foundations to LLMs](https://www.amazon.com/dp/1804619183)**  \n",
        "![image.png]()\n",
        "\n",
        "This Colab notebook is referenced in our book's Github repo:   \n",
        "https://github.com/PacktPublishing/Mastering-NLP-from-Foundations-to-LLMs   \n",
        "<a target=\"_blank\" href=\"https://colab.research.google.com/github/PacktPublishing/Mastering-NLP-from-Foundations-to-LLMs/blob/liors_branch/Chapter5_notebooks/Ch5_Text_Classification_Traditional_ML.ipynb\">\n",
        "  <img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/>\n",
        "</a>"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "H3Sw2NqA3_D0"
      },
      "source": [
        "**Objective: Processing tweets to identify when a tweet is discussing a \"company | product news\".**\n",
        "This notebook demonstrates a complete, end-to-end ML system design of a binary classifier.  \n",
        "Given raw text of tweets, the classifier with identify which tweets are discussing a company or product news.  \n",
        "\n",
        "## The pipeline consists of:  \n",
        "1. Code settings  \n",
        "1. Gathering the data  \n",
        "1. Processing the data\n",
        "1. Prerocessing\n",
        "1. Preliminary data exploration  \n",
        "1. Feature engineering  \n",
        "1. Exploring the new numerical features\n",
        "1. Split to Train/Test\n",
        "1. Preliminary statistical analysis and feasibility study\n",
        "1. Feature selection\n",
        "1. Machine Learning  \n",
        " 11.1 Iterate over ML models  \n",
        " 11.2 Generate the chosen model  \n",
        " 11.3 Generating the train results  \n",
        " 11.4 Generating the test results  \n",
        "\n",
        "*Remark:  \n",
        "This is a complete ML pipeline that is designed to be fully inclusive in a single notebook file. This is meant to be an instruction tool.\n",
        "In a professional dev environment, the design should be distributed across reproducible `.py` files for reproducibility and efficiency.  \n",
        "\n",
        "## The Data:\n",
        "A data set from [Hugging Face:\n",
        "twitter-financial-news-topic](https://huggingface.co/datasets/zeroshot/twitter-financial-news-topic):   \n",
        ">>\n",
        "**\"**The Twitter Financial News dataset is an English-language dataset containing an annotated corpus of finance-related tweets. This dataset is used to classify finance-related tweets for their topic.  \n",
        ">>\n",
        "The dataset holds 21,107 documents annotated with 20 labels (note, we are re-labeling this dataset):  \n",
        "topics = {    \n",
        "    \"LABEL_0\": \"Analyst Update\",  \n",
        "    \"LABEL_1\": \"Fed | Central Banks\",  \n",
        "    \"**LABEL_2\": \"Company | Product News\"**, (note, we will focus on this label)  \n",
        "    \"LABEL_3\": \"Treasuries | Corporate Debt\",  \n",
        "    \"LABEL_4\": \"Dividend\",  \n",
        "    \"LABEL_5\": \"Earnings\",  \n",
        "    \"LABEL_6\": \"Energy | Oil\",  \n",
        "    \"LABEL_7\": \"Financials\",  \n",
        "    \"LABEL_8\": \"Currencies\",  \n",
        "    \"LABEL_9\": \"General News | Opinion\",  \n",
        "    \"LABEL_10\": \"Gold | Metals | Materials\",  \n",
        "    \"LABEL_11\": \"IPO\",  \n",
        "    \"LABEL_12\": \"Legal | Regulation\",  \n",
        "    \"LABEL_13\": \"M&A | Investments\",  \n",
        "    \"LABEL_14\": \"Macro\",  \n",
        "    \"LABEL_15\": \"Markets\",  \n",
        "    \"LABEL_16\": \"Politics\",  \n",
        "    \"LABEL_17\": \"Personnel Change\",  \n",
        "    \"LABEL_18\": \"Stock Commentary\",  \n",
        "    \"LABEL_19\": \"Stock Movement\",  \n",
        "}  \n",
        ">>\n",
        "The data was collected using the Twitter API. The current dataset supports the multi-class classification task.**\"**  \n",
        "\n",
        "\n",
        "**Requirements:**  \n",
        "* When running in Colab, use this runtime notebook setting: `Python 3, CPU`  "
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "g54Uf66Vz9Fi"
      },
      "source": [
        ">*```Disclaimer: The content and ideas presented in this notebook are solely those of the authors and do not represent the views or intellectual property of the authors' employers.```*"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "uJ1LYgXOpDJ6"
      },
      "source": [
        "Install:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 1,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "kO5KM7JCpDUP",
        "outputId": "0153d1a9-83d6-4a20-a736-de31a7947f16"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Collecting en-core-web-sm@ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl#sha256=86cc141f63942d4b2c5fcee06630fd6f904788d2f0ab005cce45aadb8fb73889 (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 3))\n",
            "  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl (12.8 MB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m12.8/12.8 MB\u001b[0m \u001b[31m47.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting autocorrect==2.6.1 (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 1))\n",
            "  Downloading autocorrect-2.6.1.tar.gz (622 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m622.8/622.8 kB\u001b[0m \u001b[31m7.0 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25h  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Collecting datasets==2.18.0 (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2))\n",
            "  Downloading datasets-2.18.0-py3-none-any.whl (510 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m510.5/510.5 kB\u001b[0m \u001b[31m10.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: huggingface-hub==0.20.3 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 4)) (0.20.3)\n",
            "Requirement already satisfied: matplotlib==3.7.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (3.7.1)\n",
            "Requirement already satisfied: matplotlib-inline==0.1.6 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 6)) (0.1.6)\n",
            "Requirement already satisfied: matplotlib-venn==0.11.10 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 7)) (0.11.10)\n",
            "Requirement already satisfied: nltk==3.8.1 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 8)) (3.8.1)\n",
            "Collecting num2words==0.5.13 (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 9))\n",
            "  Downloading num2words-0.5.13-py3-none-any.whl (143 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m143.3/143.3 kB\u001b[0m \u001b[31m13.1 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: regex==2023.12.25 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 10)) (2023.12.25)\n",
            "Requirement already satisfied: scikit-image==0.19.3 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 11)) (0.19.3)\n",
            "Requirement already satisfied: scikit-learn==1.2.2 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 12)) (1.2.2)\n",
            "Requirement already satisfied: scipy==1.11.4 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 13)) (1.11.4)\n",
            "Requirement already satisfied: spacy==3.7.4 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (3.7.4)\n",
            "Requirement already satisfied: spacy-legacy==3.0.12 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 15)) (3.0.12)\n",
            "Requirement already satisfied: spacy-loggers==1.0.5 in /usr/local/lib/python3.10/dist-packages (from -r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 16)) (1.0.5)\n",
            "Requirement already satisfied: filelock in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (3.13.4)\n",
            "Requirement already satisfied: numpy>=1.17 in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (1.25.2)\n",
            "Requirement already satisfied: pyarrow>=12.0.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (14.0.2)\n",
            "Requirement already satisfied: pyarrow-hotfix in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (0.6)\n",
            "Collecting dill<0.3.9,>=0.3.0 (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2))\n",
            "  Downloading dill-0.3.8-py3-none-any.whl (116 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m116.3/116.3 kB\u001b[0m \u001b[31m12.9 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: pandas in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (2.0.3)\n",
            "Requirement already satisfied: requests>=2.19.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (2.31.0)\n",
            "Requirement already satisfied: tqdm>=4.62.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (4.66.2)\n",
            "Collecting xxhash (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2))\n",
            "  Downloading xxhash-3.4.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (194 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m194.1/194.1 kB\u001b[0m \u001b[31m17.6 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hCollecting multiprocess (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2))\n",
            "  Downloading multiprocess-0.70.16-py310-none-any.whl (134 kB)\n",
            "\u001b[2K     \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m134.8/134.8 kB\u001b[0m \u001b[31m13.4 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
            "\u001b[?25hRequirement already satisfied: fsspec[http]<=2024.2.0,>=2023.1.0 in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (2023.6.0)\n",
            "Requirement already satisfied: aiohttp in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (3.9.3)\n",
            "Requirement already satisfied: packaging in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (24.0)\n",
            "Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.10/dist-packages (from datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (6.0.1)\n",
            "Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.10/dist-packages (from huggingface-hub==0.20.3->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 4)) (4.11.0)\n",
            "Requirement already satisfied: contourpy>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (1.2.1)\n",
            "Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.10/dist-packages (from matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (0.12.1)\n",
            "Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (4.51.0)\n",
            "Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (1.4.5)\n",
            "Requirement already satisfied: pillow>=6.2.0 in /usr/local/lib/python3.10/dist-packages (from matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (9.4.0)\n",
            "Requirement already satisfied: pyparsing>=2.3.1 in /usr/local/lib/python3.10/dist-packages (from matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (3.1.2)\n",
            "Requirement already satisfied: python-dateutil>=2.7 in /usr/local/lib/python3.10/dist-packages (from matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (2.8.2)\n",
            "Requirement already satisfied: traitlets in /usr/local/lib/python3.10/dist-packages (from matplotlib-inline==0.1.6->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 6)) (5.7.1)\n",
            "Requirement already satisfied: click in /usr/local/lib/python3.10/dist-packages (from nltk==3.8.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 8)) (8.1.7)\n",
            "Requirement already satisfied: joblib in /usr/local/lib/python3.10/dist-packages (from nltk==3.8.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 8)) (1.4.0)\n",
            "Collecting docopt>=0.6.2 (from num2words==0.5.13->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 9))\n",
            "  Downloading docopt-0.6.2.tar.gz (25 kB)\n",
            "  Preparing metadata (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "Requirement already satisfied: networkx>=2.2 in /usr/local/lib/python3.10/dist-packages (from scikit-image==0.19.3->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 11)) (3.3)\n",
            "Requirement already satisfied: imageio>=2.4.1 in /usr/local/lib/python3.10/dist-packages (from scikit-image==0.19.3->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 11)) (2.31.6)\n",
            "Requirement already satisfied: tifffile>=2019.7.26 in /usr/local/lib/python3.10/dist-packages (from scikit-image==0.19.3->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 11)) (2024.2.12)\n",
            "Requirement already satisfied: PyWavelets>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from scikit-image==0.19.3->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 11)) (1.6.0)\n",
            "Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.10/dist-packages (from scikit-learn==1.2.2->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 12)) (3.4.0)\n",
            "Requirement already satisfied: murmurhash<1.1.0,>=0.28.0 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (1.0.10)\n",
            "Requirement already satisfied: cymem<2.1.0,>=2.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (2.0.8)\n",
            "Requirement already satisfied: preshed<3.1.0,>=3.0.2 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (3.0.9)\n",
            "Requirement already satisfied: thinc<8.3.0,>=8.2.2 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (8.2.3)\n",
            "Requirement already satisfied: wasabi<1.2.0,>=0.9.1 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (1.1.2)\n",
            "Requirement already satisfied: srsly<3.0.0,>=2.4.3 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (2.4.8)\n",
            "Requirement already satisfied: catalogue<2.1.0,>=2.0.6 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (2.0.10)\n",
            "Requirement already satisfied: weasel<0.4.0,>=0.1.0 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (0.3.4)\n",
            "Requirement already satisfied: typer<0.10.0,>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (0.9.4)\n",
            "Requirement already satisfied: smart-open<7.0.0,>=5.2.1 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (6.4.0)\n",
            "Requirement already satisfied: pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (2.6.4)\n",
            "Requirement already satisfied: jinja2 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (3.1.3)\n",
            "Requirement already satisfied: setuptools in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (67.7.2)\n",
            "Requirement already satisfied: langcodes<4.0.0,>=3.2.0 in /usr/local/lib/python3.10/dist-packages (from spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (3.3.0)\n",
            "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (1.3.1)\n",
            "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (23.2.0)\n",
            "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (1.4.1)\n",
            "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (6.0.5)\n",
            "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (1.9.4)\n",
            "Requirement already satisfied: async-timeout<5.0,>=4.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (4.0.3)\n",
            "Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (0.6.0)\n",
            "Requirement already satisfied: pydantic-core==2.16.3 in /usr/local/lib/python3.10/dist-packages (from pydantic!=1.8,!=1.8.1,<3.0.0,>=1.7.4->spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (2.16.3)\n",
            "Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.10/dist-packages (from python-dateutil>=2.7->matplotlib==3.7.1->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 5)) (1.16.0)\n",
            "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (3.3.2)\n",
            "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (3.6)\n",
            "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (2.0.7)\n",
            "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests>=2.19.0->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (2024.2.2)\n",
            "Requirement already satisfied: blis<0.8.0,>=0.7.8 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (0.7.11)\n",
            "Requirement already satisfied: confection<1.0.0,>=0.0.1 in /usr/local/lib/python3.10/dist-packages (from thinc<8.3.0,>=8.2.2->spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (0.1.4)\n",
            "Requirement already satisfied: cloudpathlib<0.17.0,>=0.7.0 in /usr/local/lib/python3.10/dist-packages (from weasel<0.4.0,>=0.1.0->spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (0.16.0)\n",
            "Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.10/dist-packages (from jinja2->spacy==3.7.4->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 14)) (2.1.5)\n",
            "Requirement already satisfied: pytz>=2020.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (2023.4)\n",
            "Requirement already satisfied: tzdata>=2022.1 in /usr/local/lib/python3.10/dist-packages (from pandas->datasets==2.18.0->-r requirements__Ch5_Text_Classification_Traditional_ML.txt (line 2)) (2024.1)\n",
            "Building wheels for collected packages: autocorrect, docopt\n",
            "  Building wheel for autocorrect (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for autocorrect: filename=autocorrect-2.6.1-py3-none-any.whl size=622363 sha256=57d747a9e1e3647fa05c653b830616c4b6d6fc3c0b2e630ac0d1c04b34a7ac16\n",
            "  Stored in directory: /root/.cache/pip/wheels/b5/7b/6d/b76b29ce11ff8e2521c8c7dd0e5bfee4fb1789d76193124343\n",
            "  Building wheel for docopt (setup.py) ... \u001b[?25l\u001b[?25hdone\n",
            "  Created wheel for docopt: filename=docopt-0.6.2-py2.py3-none-any.whl size=13706 sha256=3212077bc0160472a4d4e2bb1c16d314456c95090fbf10afa5090d5d80b568e8\n",
            "  Stored in directory: /root/.cache/pip/wheels/fc/ab/d4/5da2067ac95b36618c629a5f93f809425700506f72c9732fac\n",
            "Successfully built autocorrect docopt\n",
            "Installing collected packages: docopt, xxhash, num2words, dill, autocorrect, multiprocess, datasets\n",
            "Successfully installed autocorrect-2.6.1 datasets-2.18.0 dill-0.3.8 docopt-0.6.2 multiprocess-0.70.16 num2words-0.5.13 xxhash-3.4.1\n"
          ]
        }
      ],
      "source": [
        "# REMARK:\n",
        "# If the below code error's out due to a Python package discrepency, it may be because new versions are causing it.\n",
        "# In which case, set \"default_installations\" to False to revert to the original image:\n",
        "default_installations = True\n",
        "if default_installations:\n",
        "  !pip -q install datasets num2words autocorrect\n",
        "else:\n",
        "  import requests\n",
        "  text_file_path = \"requirements__Ch5_Text_Classification_Traditional_ML.txt\"\n",
        "  url = \"https://raw.githubusercontent.com/PacktPublishing/Mastering-NLP-from-Foundations-to-LLMs/main/Chapter5_notebooks/\" + text_file_path\n",
        "  res = requests.get(url)\n",
        "  with open(text_file_path, \"w\") as f:\n",
        "    f.write(res.text)\n",
        "\n",
        "  !pip install -r requirements__Ch5_Text_Classification_Traditional_ML.txt"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "S_--320f-8Yi"
      },
      "source": [
        "Imports:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 2,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "E0zHJoN61Zu5",
        "outputId": "1b2d86ef-b1db-4580-e327-d135281d2ba3"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "[nltk_data] Downloading package punkt to /root/nltk_data...\n",
            "[nltk_data]   Unzipping tokenizers/punkt.zip.\n",
            "[nltk_data] Downloading package stopwords to /root/nltk_data...\n",
            "[nltk_data]   Unzipping corpora/stopwords.zip.\n",
            "[nltk_data] Downloading package wordnet to /root/nltk_data...\n"
          ]
        }
      ],
      "source": [
        "import numpy as np\n",
        "import pandas as pd\n",
        "import matplotlib\n",
        "\n",
        "import scipy\n",
        "import re\n",
        "from datasets import load_dataset\n",
        "\n",
        "from num2words import num2words\n",
        "import nltk; nltk.download('punkt'); nltk.download('stopwords'); nltk.download('wordnet')\n",
        "from nltk.corpus import stopwords\n",
        "from nltk.stem.porter import PorterStemmer\n",
        "from nltk.stem import WordNetLemmatizer\n",
        "from autocorrect import Speller\n",
        "\n",
        "# ML imports:\n",
        "from sklearn.feature_extraction.text import TfidfVectorizer,CountVectorizer\n",
        "from sklearn.model_selection import cross_val_score, GridSearchCV, StratifiedKFold\n",
        "import sklearn.linear_model as lm\n",
        "from sklearn.ensemble import RandomForestClassifier\n",
        "from sklearn.neighbors import KNeighborsClassifier\n",
        "from sklearn.svm import SVC\n",
        "from sklearn.tree import DecisionTreeClassifier\n",
        "from sklearn.metrics import classification_report\n",
        "from sklearn.metrics import confusion_matrix\n",
        "\n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "Ql6clhha--CD"
      },
      "source": [
        "### Code Settings\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 3,
      "metadata": {
        "id": "02lniQAu-6ou"
      },
      "outputs": [],
      "source": [
        "# Items:\n",
        "# db_name: The db name from HuggingFace that holds the raw data\n",
        "# do_preprocessing: Logical, should preprocessing be performed\n",
        "# do_enhanced_preprocessing: Logical, should the computation-heavy preprocessing be performed\n",
        "# do_feature_eng: Logical\n",
        "# maximize_a_priori: Logocal, should the univariate preliminary feature selection be based on a priori or a postiori stats\n",
        "# num_chosen_features_per_class: Int, for the preliminary feature selection, how many features should be selected per class\n",
        "# test_size: ratio between 0 - 1\n",
        "# feature_eng_details: Either \"TfidfVectorizer\" (for TFIDF feature eng.) or \"CountVectorizer\" (for one hot encoding)\n",
        "# seed: Integer, the random seed used to insure reproducibility of results\n",
        "config_dict = {'db_name': \"zeroshot/twitter-financial-news-topic\",\n",
        "               'do_preprocessing': True,\n",
        "               'do_enhanced_preprocessing': False,\n",
        "               'do_feature_eng': True,\n",
        "               'maximize_a_priori': True,\n",
        "               'num_chosen_features_per_class': 200,\n",
        "               'test_size': 0.2,\n",
        "               'feature_eng_details': \"CountVectorizer-binary\",\n",
        "               'ngram_range_min': 1,\n",
        "               'ngram_range_max': 2,\n",
        "               'max_features': 1000,\n",
        "               'seed': 0}\n",
        "\n",
        "pd.set_option('display.max_colwidth', None)\n",
        "pd.set_option('display.max_rows', None)\n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "3d6x7m9s_eh0"
      },
      "source": [
        "### Gathering the Data"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 301,
          "referenced_widgets": [
            "748de276f13f41a685e2af7072f3c715",
            "14dc9b9eb36946199b4c3af3bcb26969",
            "9d8d08f40e734ef1aa345d628893c621",
            "fc122e69d02a47ccb5bb751ba273f25c",
            "5554cebca18943ecb933772e1ab390e7",
            "637d14198225420ebe39e1473c4e384b",
            "7551693c02134155b91b9ab41663030b",
            "beebbed93da24644b35592bc9fce8156",
            "81fde237bd974ae4bac2f54b7e0e8fd9",
            "ba4fef970e684538aef0ddadb22e03d5",
            "5594c2f22edd4ddc936d5f24e88177d7",
            "1fc676fcf8364c84bff1e5bfca25ff96",
            "238e0af5085a4a93bb4d66e26610b7ee",
            "66d7bd40c2344716b5b57a9cb19d275a",
            "294b50433db146b7b368d5fd1a057568",
            "6118939ea4c84b9fb24fed74afc39574",
            "c2148be1cf704992bc76621abee97324",
            "89468e4eecf94e7ebcf5bec407af5edd",
            "716d1dd8ad2942d3bc31b0377fc7382e",
            "84d7f755cb8040ac8885318ef9986f7f",
            "48b52c41ef6b4e5ab9196aa8e2d963d4",
            "36926767be6e4706ae0c8d25979ca2b3",
            "4560a66e9ac6449fa3cacd0635c8cd9c",
            "f2eb53fae69a4a59b4c525041cbdf750",
            "621c11b74e9045f587cf6718b4a072c2",
            "ce00568869fd447086f301c78653331f",
            "bf79ed7ce8d34a93a6fb01ad9f9f70d2",
            "ebae02b066f04b5c89b14a135603e327",
            "c2acd202411447a19291cca13bc05a23",
            "dc280c0ec8ce41389fc16979c9df55d6",
            "370d37ca6a094e1ab3512ffb03ebabcc",
            "27cd32faf9304275a4c916e5636c43db",
            "d940b1cf380d43969bcdc9313603b691",
            "32758e000b91482e8f3ea6593ff2a38e",
            "474a5a77556140c5b961a83331acd113",
            "e276e3c1c48f48709e4f6a0ede6a6f11",
            "52043fd851df47109b6f49792918df1e",
            "8f9b145022124e6a93de9c7ee68e7dab",
            "6592833b05db42edbbbd75e9e1c23047",
            "0b8d4b08978b4effab529c0e3d7edd55",
            "aeedd1b5ca9b4d43a85fbd99ec21724e",
            "584d4df165a045b8b757b857239b1355",
            "491d61801d434be9aa737a6e432ddf73",
            "24e9d61d181041ab8a4e49218ab9aefd",
            "3ace061f2c0b4196b329a18f45834a20",
            "b381912409de4c268111d685e6ae9706",
            "dfa1cdad61734a4dbf5e3865c8f09495",
            "756364a0d24d4a3e92aa58e27a94d35d",
            "4465c14c53d848c1b90abf46052a3160",
            "52ceda6fa527475090b5e400101615f7",
            "1af3968ced8f452aa321dd1f4fe0da65",
            "2d3cc6d7b6554f538383fda744484ec8",
            "4c9ee2bebe564ffda10ded823cdd6ed4",
            "b4bce40ddf4d4c3b9e9398142b8bf935",
            "cd5f938d3f8342e1b0adcd9b9aa627f5"
          ]
        },
        "id": "iJd1W79I4JmR",
        "outputId": "8517f76c-472e-4a69-e050-00fa26e45256"
      },
      "outputs": [
        {
          "name": "stderr",
          "output_type": "stream",
          "text": [
            "/usr/local/lib/python3.10/dist-packages/huggingface_hub/utils/_token.py:88: UserWarning: \n",
            "The secret `HF_TOKEN` does not exist in your Colab secrets.\n",
            "To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.\n",
            "You will be able to reuse this secret in all of your notebooks.\n",
            "Please note that authentication is recommended but still optional to access public models or datasets.\n",
            "  warnings.warn(\n"
          ]
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "748de276f13f41a685e2af7072f3c715",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Downloading readme:   0%|          | 0.00/1.97k [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "1fc676fcf8364c84bff1e5bfca25ff96",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Downloading data:   0%|          | 0.00/2.39M [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "4560a66e9ac6449fa3cacd0635c8cd9c",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Downloading data:   0%|          | 0.00/580k [00:00<?, ?B/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "32758e000b91482e8f3ea6593ff2a38e",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Generating train split: 0 examples [00:00, ? examples/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        },
        {
          "data": {
            "application/vnd.jupyter.widget-view+json": {
              "model_id": "3ace061f2c0b4196b329a18f45834a20",
              "version_major": 2,
              "version_minor": 0
            },
            "text/plain": [
              "Generating validation split: 0 examples [00:00, ? examples/s]"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "dataset_raw = load_dataset(config_dict[\"db_name\"])"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "utLjs7A60MSQ"
      },
      "source": [
        "### Processing the Data\n",
        "Set up the data so that it includes only the cases we care about.  "
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "r9aEAKgM7yUB"
      },
      "source": [
        "Create one complete dataframe.  \n",
        "Note that HuggingFace originally split the dataset into two subsets, train and validation.  \n",
        "We concatenate them here and will later re-split them with a ratio of our choice.   "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "id": "d9wi1mAY7yeO"
      },
      "outputs": [],
      "source": [
        "first_df = pd.DataFrame(dataset_raw[\"train\"])\n",
        "second_df = pd.DataFrame(dataset_raw[\"validation\"])\n",
        "dataset_df = pd.concat([first_df, second_df]).reset_index(drop=True)\n",
        "dataset_df = dataset_df.rename(columns={\"label\": \"_label_\"})"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "gOo5mBx004k7"
      },
      "source": [
        "Let's have a quick look at the raw data:  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 363
        },
        "id": "L5wF9r9A04uA",
        "outputId": "01760c06-0caa-4c01-8cdb-f9adf0ed133b"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<style type=\"text/css\">\n",
              "#T_9353d_row0_col0, #T_9353d_row0_col1, #T_9353d_row1_col0, #T_9353d_row1_col1, #T_9353d_row2_col0, #T_9353d_row2_col1, #T_9353d_row3_col0, #T_9353d_row3_col1, #T_9353d_row4_col0, #T_9353d_row4_col1, #T_9353d_row5_col0, #T_9353d_row5_col1, #T_9353d_row6_col0, #T_9353d_row6_col1, #T_9353d_row7_col0, #T_9353d_row7_col1, #T_9353d_row8_col0, #T_9353d_row8_col1, #T_9353d_row9_col0, #T_9353d_row9_col1 {\n",
              "  text-align: left;\n",
              "}\n",
              "</style>\n",
              "<table id=\"T_9353d\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr>\n",
              "      <th class=\"blank level0\" >&nbsp;</th>\n",
              "      <th id=\"T_9353d_level0_col0\" class=\"col_heading level0 col0\" >text</th>\n",
              "      <th id=\"T_9353d_level0_col1\" class=\"col_heading level0 col1\" >_label_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
              "      <td id=\"T_9353d_row0_col0\" class=\"data row0 col0\" >Here are Thursday's biggest analyst calls: Apple, Amazon, Tesla, Palantir, DocuSign, Exxon &amp; more  https://t.co/QPN8Gwl7Uh</td>\n",
              "      <td id=\"T_9353d_row0_col1\" class=\"data row0 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
              "      <td id=\"T_9353d_row1_col0\" class=\"data row1 col0\" >Buy Las Vegas Sands as travel to Singapore builds, Wells Fargo says  https://t.co/fLS2w57iCz</td>\n",
              "      <td id=\"T_9353d_row1_col1\" class=\"data row1 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
              "      <td id=\"T_9353d_row2_col0\" class=\"data row2 col0\" >Piper Sandler downgrades DocuSign to sell, citing elevated risks amid CEO transition  https://t.co/1EmtywmYpr</td>\n",
              "      <td id=\"T_9353d_row2_col1\" class=\"data row2 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
              "      <td id=\"T_9353d_row3_col0\" class=\"data row3 col0\" >Analysts react to Tesla's latest earnings, break down what's next for electric car maker  https://t.co/kwhoE6W06u</td>\n",
              "      <td id=\"T_9353d_row3_col1\" class=\"data row3 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
              "      <td id=\"T_9353d_row4_col0\" class=\"data row4 col0\" >Netflix and its peers are set for a ‘return to growth,’ analysts say, giving one stock 120% upside  https://t.co/jPpdl0D9s4</td>\n",
              "      <td id=\"T_9353d_row4_col1\" class=\"data row4 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row5\" class=\"row_heading level0 row5\" >5</th>\n",
              "      <td id=\"T_9353d_row5_col0\" class=\"data row5 col0\" >Barclays believes earnings for these underperforming stocks may surprise Wall Street  https://t.co/PHbsyVGAyE</td>\n",
              "      <td id=\"T_9353d_row5_col1\" class=\"data row5 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row6\" class=\"row_heading level0 row6\" >6</th>\n",
              "      <td id=\"T_9353d_row6_col0\" class=\"data row6 col0\" >Bernstein upgrades Alibaba, says shares can rally more than 20% from here  https://t.co/m3ApoPRGU0</td>\n",
              "      <td id=\"T_9353d_row6_col1\" class=\"data row6 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row7\" class=\"row_heading level0 row7\" >7</th>\n",
              "      <td id=\"T_9353d_row7_col0\" class=\"data row7 col0\" >Analysts react to Netflix's strong quarter, with some pointing to a potential bottom for the stock  https://t.co/cQngJsyefD</td>\n",
              "      <td id=\"T_9353d_row7_col1\" class=\"data row7 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row8\" class=\"row_heading level0 row8\" >8</th>\n",
              "      <td id=\"T_9353d_row8_col0\" class=\"data row8 col0\" >Buy Chevron as shares look attractive at these levels, HSBC says  https://t.co/GkDpFvxjEP</td>\n",
              "      <td id=\"T_9353d_row8_col1\" class=\"data row8 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_9353d_level0_row9\" class=\"row_heading level0 row9\" >9</th>\n",
              "      <td id=\"T_9353d_row9_col0\" class=\"data row9 col0\" >Morgan Stanley says these global stocks are set for earnings beats — and gives one over 45% upside  https://t.co/GeWxa5YoWr</td>\n",
              "      <td id=\"T_9353d_row9_col1\" class=\"data row9 col1\" >0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n"
            ],
            "text/plain": [
              "<pandas.io.formats.style.Styler at 0x7c21f0541390>"
            ]
          },
          "execution_count": 6,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset_df.head(10).style.set_properties(**{'text-align': 'left'})"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 7,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "pZ7hseIDJbX0",
        "outputId": "b9df80d6-3f01-4a64-e472-4120970cf86e"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Distribution of original labels:\n",
            "\n"
          ]
        },
        {
          "data": {
            "text/plain": [
              "_label_\n",
              "2          4397\n",
              "18         2646\n",
              "14         2237\n",
              "9          1893\n",
              "16         1234\n",
              "5          1229\n",
              "1          1051\n",
              "19         1020\n",
              "7           784\n",
              "6           670\n",
              "15          626\n",
              "17          607\n",
              "12          606\n",
              "13          587\n",
              "4           456\n",
              "3           398\n",
              "0           328\n",
              "8           198\n",
              "10           82\n",
              "11           58\n",
              "Name: count, dtype: int64"
            ]
          },
          "execution_count": 7,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "print(\"Distribution of original labels:\\n\")\n",
        "dataset_df[[\"_label_\"]].value_counts()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "uJIfEYzYzzC4"
      },
      "source": [
        "We are going to focus on one particular topic: **\"Company | Product News\"** (label 2)    \n",
        "So we are going to re-label:  \n",
        ">>\n",
        "Label 0: **Not** Company | Product News  \n",
        "Label 1: Company | Product News\n",
        "\n",
        "So now the classification problem is a binary classification problem.  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 8,
      "metadata": {
        "id": "pbe_tsjmzzMX"
      },
      "outputs": [],
      "source": [
        "dataset_df_binary = dataset_df.copy()\n",
        "dataset_df_binary[\"_label_\"] = dataset_df_binary[\"_label_\"].map({2:1}).fillna(0).map(int)"
      ]
    },
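    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The re-labeling above can also be written as a direct comparison against the target label.  \n",
        "The cell below is a minimal sketch on a toy dataframe (`toy_df` and its values are made up for illustration and are not part of the dataset). It checks that both forms yield the same binary labels.  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "\n",
        "# Toy data, made up for illustration only\n",
        "toy_df = pd.DataFrame({\"_label_\": [2, 18, 0, 2, 9]})\n",
        "\n",
        "# Form used in this notebook: label 2 maps to 1, every other label becomes NaN and then 0\n",
        "via_map = toy_df[\"_label_\"].map({2: 1}).fillna(0).map(int)\n",
        "\n",
        "# Equivalent form: compare to the target label and cast the booleans to integers\n",
        "via_compare = (toy_df[\"_label_\"] == 2).astype(int)\n",
        "\n",
        "assert via_map.tolist() == via_compare.tolist() == [1, 0, 0, 1, 0]"
      ]
    },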
    {
      "cell_type": "code",
      "execution_count": 9,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "atVRw5FFIMaJ",
        "outputId": "10334a1c-57b3-4cbc-cf4b-e3070263a755"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Distribution of new labels:\n",
            "\n"
          ]
        },
        {
          "data": {
            "text/plain": [
              "_label_\n",
              "0          16710\n",
              "1           4397\n",
              "Name: count, dtype: int64"
            ]
          },
          "execution_count": 9,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "print(\"Distribution of new labels:\\n\")\n",
        "frequencies = dataset_df_binary[[\"_label_\"]].value_counts()\n",
        "frequencies"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 10,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "Fq9QEVWXbTMX",
        "outputId": "a3550f8a-9a2c-4468-fe9e-968d9decf4bd"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "The most frequent class is: 0\n",
            "And its baseline probability is: 0.792\n"
          ]
        }
      ],
      "source": [
        "# value_counts() sorts by descending count, so the first index entry is the most frequent class\n",
        "most_frequent_class = frequencies.index[0][0]\n",
        "print(\"The most frequent class is:\", most_frequent_class)\n",
        "print(\"And its baseline probability is:\", round((dataset_df_binary[\"_label_\"] == most_frequent_class).mean(), 3))"
      ]
    },
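    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The baseline probability above is the accuracy of a trivial classifier that always predicts the most frequent class, so a useful model is expected to score above it.  \n",
        "The cell below is a minimal sketch of that idea using only the standard library. The labels in `toy_labels` are made up for illustration.  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from collections import Counter\n",
        "\n",
        "# Toy labels, made up for illustration: four negatives and one positive\n",
        "toy_labels = [0, 0, 1, 0, 0]\n",
        "\n",
        "# Find the most frequent class and how often it occurs\n",
        "majority_class, majority_count = Counter(toy_labels).most_common(1)[0]\n",
        "\n",
        "# Always predicting the majority class is correct exactly this often\n",
        "baseline_accuracy = majority_count / len(toy_labels)\n",
        "\n",
        "print(\"Majority class:\", majority_class)\n",
        "print(\"Baseline accuracy:\", round(baseline_accuracy, 3))"
      ]
    },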
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "w5KVvs7J_giZ"
      },
      "source": [
        "### Preprocessing\n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "RJm8NNust0H0"
      },
      "source": [
        "Define the preprocessing utility functions:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 11,
      "metadata": {
        "id": "uNJaNvEYt0SM"
      },
      "outputs": [],
      "source": [
        "def digits_to_words(match):\n",
        "  \"\"\"\n",
        "  Convert a string of digits to English words. The function distinguishes between\n",
        "  cardinal and ordinal numbers.\n",
        "  E.g. \"2\" becomes \"two\", while \"2nd\" becomes \"second\".\n",
        "\n",
        "  Input: re.Match (as passed by re.sub to its replacement callback)\n",
        "  Output: str\n",
        "  \"\"\"\n",
        "  suffixes = ['st', 'nd', 'rd', 'th']\n",
        "  # Lower-case here so as not to rely on any earlier processing step:\n",
        "  string = match[0].lower()\n",
        "  if string[-2:] in suffixes:\n",
        "    number_type = 'ordinal'\n",
        "    string = string[:-2]\n",
        "  else:\n",
        "    number_type = 'cardinal'\n",
        "\n",
        "  return num2words(string, to=number_type)\n",
        "\n",
        "\n",
        "def spelling_correction(text):\n",
        "    \"\"\"\n",
        "    Replace misspelled words with the correct spelling.\n",
        "\n",
        "    Input: str\n",
        "    Output: str\n",
        "    \"\"\"\n",
        "    corrector = Speller()\n",
        "    spells = [corrector(word) for word in text.split()]\n",
        "    return \" \".join(spells)\n",
        "\n",
        "\n",
        "def remove_stop_words(text):\n",
        "    \"\"\"\n",
        "    Remove stopwords.\n",
        "\n",
        "    Input: str\n",
        "    Output: str\n",
        "    \"\"\"\n",
        "    stopwords_set = set(stopwords.words('english'))\n",
        "    return \" \".join([word for word in text.split() if word not in stopwords_set])\n",
        "\n",
        "\n",
        "def stemming(text):\n",
        "    \"\"\"\n",
        "    Perform stemming of each word individually.\n",
        "\n",
        "    Input: str\n",
        "    Output: str\n",
        "    \"\"\"\n",
        "    stemmer = PorterStemmer()\n",
        "    return \" \".join([stemmer.stem(word) for word in text.split()])\n",
        "\n",
        "\n",
        "def lemmatizing(text):\n",
        "    \"\"\"\n",
        "    Perform lemmatization for each word individually.\n",
        "\n",
        "    Input: str\n",
        "    Output: str\n",
        "    \"\"\"\n",
        "    lemmatizer = WordNetLemmatizer()\n",
        "    return \" \".join([lemmatizer.lemmatize(word) for word in text.split()])\n",
        "\n",
        "\n",
        "\n",
        "\n",
        "def preprocessing(input_text):\n",
        "  \"\"\"\n",
        "  This function represents a complete pipeline for text preprocessing.\n",
        "\n",
        "  Input: str\n",
        "  Output: str\n",
        "  \"\"\"\n",
        "  output = input_text\n",
        "  # Lower casing:\n",
        "  output = output.lower()\n",
        "  # Remove URLs:\n",
        "  output = re.sub(r'http\\S+', \"\", output)\n",
        "  # Convert digits to words:\n",
        "  # The following regex matches consecutive digits, optionally followed by an ordinal suffix:\n",
        "  output = re.sub(r'\\d+(st)?(nd)?(rd)?(th)?', digits_to_words, output, flags=re.IGNORECASE)\n",
        "  # Remove punctuation and other special characters:\n",
        "  output = re.sub('[^ A-Za-z0-9]+', '', output)\n",
        "\n",
        "  if config_dict[\"do_enhanced_preprocessing\"]:\n",
        "    # Spelling corrections:\n",
        "    output = spelling_correction(output)\n",
        "\n",
        "  # Remove stop words:\n",
        "  output = remove_stop_words(output)\n",
        "\n",
        "  if config_dict[\"do_enhanced_preprocessing\"]:\n",
        "    # Stemming:\n",
        "    output = stemming(output)\n",
        "    # Lemmatizing:\n",
        "    output = lemmatizing(output)\n",
        "\n",
        "  return output"
      ]
    },
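    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The digit conversion in `preprocessing()` relies on `re.sub` accepting a function as the replacement: the function is called once per match and its return value replaces the matched text.  \n",
        "The cell below is a minimal sketch of that mechanism. `tag_number` is a hypothetical stand-in for `digits_to_words` that only labels each match, so the cell runs without `num2words`. The sample sentence is made up for illustration.  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import re\n",
        "\n",
        "# The pattern used in preprocessing(): digits, optionally followed by an ordinal suffix\n",
        "pattern = r'\\d+(st)?(nd)?(rd)?(th)?'\n",
        "\n",
        "def tag_number(match):\n",
        "    # Hypothetical stand-in for digits_to_words(): label the match instead of spelling it out\n",
        "    token = match[0].lower()\n",
        "    if token[-2:] in ('st', 'nd', 'rd', 'th'):\n",
        "        return '<ordinal:' + token[:-2] + '>'\n",
        "    return '<cardinal:' + token + '>'\n",
        "\n",
        "# re.sub calls tag_number once for every match and substitutes its return value\n",
        "demo = re.sub(pattern, tag_number, 'ranked 2nd with 120 points', flags=re.IGNORECASE)\n",
        "print(demo)  # prints: ranked <ordinal:2> with <cardinal:120> points"
      ]
    },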
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "bgrWf0mEt0bm"
      },
      "source": [
        "Perform the preprocessing:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 12,
      "metadata": {
        "id": "IEedfizM6FtZ"
      },
      "outputs": [],
      "source": [
        "dataset_clean = dataset_df_binary.copy()\n",
        "if config_dict[\"do_preprocessing\"]:\n",
        "  dataset_clean[\"text\"] = [preprocessing(text) for text in dataset_clean[\"text\"]]"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 13,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 363
        },
        "id": "MSYLrwr7KREy",
        "outputId": "284a7ee2-8aaa-43cf-dac3-f2a911626b93"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<style type=\"text/css\">\n",
              "#T_45ebd_row0_col0, #T_45ebd_row0_col1, #T_45ebd_row1_col0, #T_45ebd_row1_col1, #T_45ebd_row2_col0, #T_45ebd_row2_col1, #T_45ebd_row3_col0, #T_45ebd_row3_col1, #T_45ebd_row4_col0, #T_45ebd_row4_col1, #T_45ebd_row5_col0, #T_45ebd_row5_col1, #T_45ebd_row6_col0, #T_45ebd_row6_col1, #T_45ebd_row7_col0, #T_45ebd_row7_col1, #T_45ebd_row8_col0, #T_45ebd_row8_col1, #T_45ebd_row9_col0, #T_45ebd_row9_col1 {\n",
              "  text-align: left;\n",
              "}\n",
              "</style>\n",
              "<table id=\"T_45ebd\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr>\n",
              "      <th class=\"blank level0\" >&nbsp;</th>\n",
              "      <th id=\"T_45ebd_level0_col0\" class=\"col_heading level0 col0\" >text</th>\n",
              "      <th id=\"T_45ebd_level0_col1\" class=\"col_heading level0 col1\" >_label_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row0\" class=\"row_heading level0 row0\" >0</th>\n",
              "      <td id=\"T_45ebd_row0_col0\" class=\"data row0 col0\" >thursdays biggest analyst calls apple amazon tesla palantir docusign exxon amp</td>\n",
              "      <td id=\"T_45ebd_row0_col1\" class=\"data row0 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row1\" class=\"row_heading level0 row1\" >1</th>\n",
              "      <td id=\"T_45ebd_row1_col0\" class=\"data row1 col0\" >buy las vegas sands travel singapore builds wells fargo says</td>\n",
              "      <td id=\"T_45ebd_row1_col1\" class=\"data row1 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row2\" class=\"row_heading level0 row2\" >2</th>\n",
              "      <td id=\"T_45ebd_row2_col0\" class=\"data row2 col0\" >piper sandler downgrades docusign sell citing elevated risks amid ceo transition</td>\n",
              "      <td id=\"T_45ebd_row2_col1\" class=\"data row2 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row3\" class=\"row_heading level0 row3\" >3</th>\n",
              "      <td id=\"T_45ebd_row3_col0\" class=\"data row3 col0\" >analysts react teslas latest earnings break whats next electric car maker</td>\n",
              "      <td id=\"T_45ebd_row3_col1\" class=\"data row3 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row4\" class=\"row_heading level0 row4\" >4</th>\n",
              "      <td id=\"T_45ebd_row4_col0\" class=\"data row4 col0\" >netflix peers set return growth analysts say giving one stock one hundred twenty upside</td>\n",
              "      <td id=\"T_45ebd_row4_col1\" class=\"data row4 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row5\" class=\"row_heading level0 row5\" >5</th>\n",
              "      <td id=\"T_45ebd_row5_col0\" class=\"data row5 col0\" >barclays believes earnings underperforming stocks may surprise wall street</td>\n",
              "      <td id=\"T_45ebd_row5_col1\" class=\"data row5 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row6\" class=\"row_heading level0 row6\" >6</th>\n",
              "      <td id=\"T_45ebd_row6_col0\" class=\"data row6 col0\" >bernstein upgrades alibaba says shares rally twenty</td>\n",
              "      <td id=\"T_45ebd_row6_col1\" class=\"data row6 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row7\" class=\"row_heading level0 row7\" >7</th>\n",
              "      <td id=\"T_45ebd_row7_col0\" class=\"data row7 col0\" >analysts react netflixs strong quarter pointing potential bottom stock</td>\n",
              "      <td id=\"T_45ebd_row7_col1\" class=\"data row7 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row8\" class=\"row_heading level0 row8\" >8</th>\n",
              "      <td id=\"T_45ebd_row8_col0\" class=\"data row8 col0\" >buy chevron shares look attractive levels hsbc says</td>\n",
              "      <td id=\"T_45ebd_row8_col1\" class=\"data row8 col1\" >0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th id=\"T_45ebd_level0_row9\" class=\"row_heading level0 row9\" >9</th>\n",
              "      <td id=\"T_45ebd_row9_col0\" class=\"data row9 col0\" >morgan stanley says global stocks set earnings beats gives one fortyfive upside</td>\n",
              "      <td id=\"T_45ebd_row9_col1\" class=\"data row9 col1\" >0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n"
            ],
            "text/plain": [
              "<pandas.io.formats.style.Styler at 0x7c21ec85fbb0>"
            ]
          },
          "execution_count": 13,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset_clean.head(10).style.set_properties(**{'text-align': 'left'})"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "Ht259WodQFLK"
      },
      "source": [
        "## Preliminary data exploration\n",
        "Every ML project should start with a basic exploration of the data.  \n",
        "The main objectives are to understand the nature of the data and to study whether there is a connection between the \"X\" data (in our case, the tweets' text) and the target value, \"Y\" (in our case, the topic label).  \n",
        "\n",
        "Here we start by looking at the simplest characteristic of the data: the length of each tweet.  \n",
        "We will later explore the statistical dependence between the language used and the topic label.  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 14,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 582
        },
        "id": "zbe9QF-lNnLs",
        "outputId": "70e15877-2cf9-4d21-9c02-e6daf2042ca1"
      },
      "outputs": [
        {
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAA2wAAAI1CAYAAACqmnnaAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAAB0SElEQVR4nO3deVxWZf7/8fcNyKYsboAoKrmluaWWkUtuhcqYW4vlgubkVGiuZU5pmhmpk6lZ2syUS+lYNmqT5oJ7FrnlXuGSSiWLpYBLosD1+8Mf59stuIDccAuv5+NxPx6ec677XJ9ze3HL23POdWzGGCMAAAAAgNNxKeoCAAAAAAC5I7ABAAAAgJMisAEAAACAkyKwAQAAAICTIrABAAAAgJMisAEAAACAkyKwAQAAAICTIrABAAAAgJMisAEAAACAkyKwASgU48ePl81mK5S+2rRpozZt2ljLmzZtks1m02effVYo/ffv31/Vq1cvlL7y69y5c/rrX/+qoKAg2Ww2DRs2rKhLuu3NmzdPNptNx48fL/S+C/Pn61YU9s9iUSnKsQCg+CGwAciz7F9Gsl+enp4KDg5WeHi4Zs6cqbNnzxZIPydPntT48eO1Z8+eAtlfQXLm2m7GG2+8oXnz5unZZ5/VRx99pL59+1637fLlywuvuDz6/vvvNX78+AL95fi9997TvHnzCmx/Jc2iRYs0ffr0oi4DAIoFAhuAfHvttdf00Ucfafbs2RoyZIgkadiwYWrQoIH27dtn1/aVV17RH3/8kaf9nzx5UhMmTMhzKFq7dq3Wrl2bp/fk1fVq+9e//qW4uDiH9n+rNmzYoPvuu0+vvvqq+vTpo6ZNm16z7e0Q2CZMmFDkga1v3776448/VK1atQKr43ZFYAOAguNW1AUAuH116tRJzZo1s5bHjBmjDRs26C9/+Ysefvhh/fDDD/Ly8pIkubm5yc3NsV85Fy5ckLe3t9zd3R3az42UKlWqSPu/GcnJyapXr15Rl1EsnD9/XqVLl5arq6tcXV2LuhwUgOzvEgBwBpxhA1Cg2rVrp7Fjx+rEiRP6+OOPrfW53WMTExOjli1byt/fX2XKlFGdOnX097//XdKVe13uueceSdKAAQOsyy+zz3q0adNG9evX165du9S6dWt5e3tb7736HrZsmZmZ+vvf/66goCCVLl1aDz/8sH7++We7NtWrV1f//v1zvPfP+7xRbbndw3b+/HmNHDlSISEh8vDwUJ06dfSPf/xDxhi7djabTYMHD9by5ctVv359eXh46K677tLq1atz/8CvkpycrIEDByowMFCenp5q1KiR5s+fb23Pvofo2LFjWrlypVX7tc5O2Ww2nT9/XvPnz7fa9u/fX/v27ZPNZtP//vc/q+2uXbtks9nUpEkTu3106tRJzZs3t1u3atUqtWrVSqVLl5aPj48iIiJ08ODBHP3/+OOPeuSRR1SuXDl5enqqWbNmdn3OmzdPjz76qCSpbdu2Vo2bNm265meUmJioAQMGqEqVKvLw8FClSpXUtWtX6zOoXr26Dh48qM2bN1v7y/67z74cePPmzXruuecUEBCgKlWq2G3782dZvXp1/eUvf9HWrVt17733ytPTU3fccYcWLFiQo659+/bpgQcekJeXl6pUqaLXX39dc+fOvaV7oT7++GM1bdpUXl5eKleunHr16pVjzGf/LH3//fdq27atvL29VblyZU2ZMiXH/k6cOKGHH35YpUuXVkBAgIYPH641a9bYfeZt2rTRypUrdeLECevzu/rnISsrS5MmTVKVKlXk6emp9u3b68iRIzc8nuzvkR9//FGPPfaYfH19Vb58eQ0dOlQXL168pePP7bvkWrL7r1ixory8vFSnTh29/PLL133P559/roiICAUHB8vDw0M1atTQxIkTlZmZadfu8OHD6tmzp4KCguTp6akqVaqoV69eSk1Ntdpc77sTQPHDGTYABa5v3776+9//rr
Vr1+rpp5/Otc3Bgwf1l7/8RQ0bNtRrr70mDw8PHTlyRF9//bUkqW7dunrttdc0btw4DRo0SK1atZIk3X///dY+fv/9d3Xq1Em9evVSnz59FBgYeN26Jk2aJJvNptGjRys5OVnTp09Xhw4dtGfPHutM4M24mdr+zBijhx9+WBs3btTAgQPVuHFjrVmzRi+88IJ+/fVXvf3223btt27dqqVLl+q5556Tj4+PZs6cqZ49eyo+Pl7ly5e/Zl1//PGH2rRpoyNHjmjw4MEKDQ3VkiVL1L9/f6WkpGjo0KGqW7euPvroIw0fPlxVqlTRyJEjJUkVK1bMdZ8fffSR/vrXv+ree+/VoEGDJEk1atRQ/fr15e/vry1btujhhx+WJH311VdycXHR3r17lZaWJl9fX2VlZembb76x3pu9z8jISIWHh2vy5Mm6cOGCZs+erZYtW2r37t3WL/cHDx5UixYtVLlyZb300ksqXbq0Pv30U3Xr1k3//e9/1b17d7Vu3VrPP/+8Zs6cqb///e+qW7eu9Xd0LT179tTBgwc1ZMgQVa9eXcnJyYqJiVF8fLyqV6+u6dOna8iQISpTpoz1S/jVY+u5555TxYoVNW7cOJ0/f/6afUnSkSNH9Mgjj2jgwIGKjIzUhx9+qP79+6tp06a66667JEm//vqrFTjHjBmj0qVL69///rc8PDyuu+/rmTRpksaOHavHHntMf/3rX3Xq1Cm98847at26tXbv3i1/f3+r7ZkzZ9SxY0f16NFDjz32mD777DONHj1aDRo0UKdOnSRd+U+Hdu3aKSEhQUOHDlVQUJAWLVqkjRs32vX78ssvKzU1Vb/88os1tsuUKWPX5s0335SLi4tGjRql1NRUTZkyRb1799a2bdtu6tgee+wxVa9eXdHR0fr22281c+ZMnTlzxi4I5+X48/Jdsm/fPrVq1UqlSpXSoEGDVL16dR09elRffPGFJk2adM33zZs3T2XKlNGIESNUpkwZbdiwQePGjVNaWpqmTp0qSbp06ZLCw8OVnp6uIUOGKCgoSL/++qtWrFihlJQU+fn53fC7E0AxZAAgj+bOnWskmR07dlyzjZ+fn7n77rut5VdffdX8+Svn7bffNpLMqVOnrrmPHTt2GElm7ty5ObY98MADRpKZM2dOrtseeOABa3njxo1GkqlcubJJS0uz1n/66adGkpkxY4a1rlq1aiYyMvKG+7xebZGRkaZatWrW8vLly40k8/rrr9u1e+SRR4zNZjNHjhyx1kky7u7uduv27t1rJJl33nknR19/Nn36dCPJfPzxx9a6S5cumbCwMFOmTBm7Y69WrZqJiIi47v6ylS5dOtfPJCIiwtx7773Wco8ePUyPHj2Mq6urWbVqlTHGmO+++85IMp9//rkxxpizZ88af39/8/TTT9vtKzEx0fj5+dmtb9++vWnQoIG5ePGitS4rK8vcf//9platWta6JUuWGElm48aNNzyWM2fOGElm6tSp121311132f19Z8se+y1btjQZGRm5bjt27Ji1rlq1akaS2bJli7UuOTnZeHh4mJEjR1rrhgwZYmw2m9m9e7e17vfffzflypXLsc/cXP3zdfz4cePq6momTZpk127//v3Gzc3Nbn32z9KCBQusdenp6SYoKMj07NnTWvfWW28ZSWb58uXWuj/++MPceeedOT7/iIgIu5+BbNk/i3Xr1jXp6enW+hkzZhhJZv/+/Td1nA8//LDd+ueee85IMnv37s338ef2XZKb1q1bGx8fH3PixAm79VlZWdafcxsLFy5cyLGvv/3tb8bb29sa47t37zaSzJIlS67Z/818dwIoXrgkEoBDlClT5rqzRWb/7/bnn3+urKysfPXh4eGhAQMG3HT7fv36ycfHx1p+5JFHVKlSJX355Zf56v9mffnll3J1ddXzzz9vt37kyJEyxmjVqlV26zt06KAaNWpYyw0bNpSvr69++umnG/YTFBSkJ554wlpXqlQpPf/88zp37pw2b9
5cAEfzf1q1aqXvvvvOOsO0detWde7cWY0bN9ZXX30l6cpZN5vNppYtW0q6cilXSkqKnnjiCf3222/Wy9XVVc2bN7fO1pw+fVobNmzQY489prNnz1rtfv/9d4WHh+vw4cP69ddf81yzl5eX3N3dtWnTJp05cybfx/7000/f9P1q9erVs87CSlfOZtapU8fu73P16tUKCwtT48aNrXXlypVT796981Xf0qVLlZWVpccee8zucw4KClKtWrVynBUrU6aM+vTpYy27u7vr3nvvzVFj5cqVrTOqkuTp6XnNs+jXM2DAALt7TbM/nxuN8WxRUVF2y9mTHmX/LOf1+G/2u+TUqVPasmWLnnrqKVWtWtVu240eq/Dns/jZY7pVq1a6cOGCfvzxR0mSn5+fJGnNmjW6cOFCrvspiO9OALcXAhsAhzh37pxdOLra448/rhYtWuivf/2rAgMD1atXL3366ad5+gWkcuXKeZpgpFatWnbLNptNNWvWdPizkk6cOKHg4OAcn0f2ZXsnTpywW3/1L4KSVLZs2RsGjBMnTqhWrVpycbH/ar9WP7eqVatWysjIUGxsrOLi4pScnKxWrVqpdevWdoGtXr16KleunKQr9+dIV+51rFixot1r7dq1Sk5OlnTlMkJjjMaOHZuj3auvvipJVtu88PDw0OTJk7Vq1SoFBgaqdevWmjJlihITE/O0n9DQ0JtuezN/nydOnFDNmjVztMtt3c04fPiwjDGqVatWjs/vhx9+yPHZValSJUfgyK3GGjVq5GiXnxqv/kzKli0rSTcdoq/+Wa5Ro4ZcXFysn+W8Hv/NfpdkB8r69evfVJ1/dvDgQXXv3l1+fn7y9fVVxYoVrZCcfX9aaGioRowYoX//+9+qUKGCwsPD9e6779rdv1YQ350Abi/cwwagwP3yyy9KTU297i9yXl5e2rJlizZu3KiVK1dq9erV+uSTT9SuXTutXbv2ps5e5OW+s5t1rf8lz8zMLLQZAK/Vj7lqgpKi1qxZM3l6emrLli2qWrWqAgICVLt2bbVq1Urvvfee0tPT9dVXX6l79+7We7J/qfzoo48UFBSUY5/ZM4lmtxs1apTCw8Nz7T+/YWbYsGHq0qWLli9frjVr1mjs2LGKjo7Whg0bdPfdd9/UPvIy9ori7zMrK0s2m02rVq3Ktf+r7ykr7BoLur+rf27zevyO+C75s5SUFD3wwAPy9fXVa6+9pho1asjT01PfffedRo8ebRe23nrrLfXv31+ff/651q5dq+eff966V69KlSoF8t0J4PZCYANQ4D766CNJuuYv2tlcXFzUvn17tW/fXtOmTdMbb7yhl19+WRs3blSHDh1ueIlRXmWf3clmjNGRI0fUsGFDa13ZsmWVkpKS470nTpzQHXfcYS3npbZq1app3bp1Onv2rN1ZtuzLoArquV3VqlXTvn37lJWVZXeW7Vb7udaxZl8299VXX6lq1arWZW2tWrVSenq6Fi5cqKSkJLVu3dp6T/alngEBAerQocM1+8z+rEuVKnXddter73pq1KihkSNHauTIkTp8+LAaN26st956y5rZtKDH3o1Uq1Yt11kSb2bmxNzUqFFDxhiFhoaqdu3at1qepCs1fv/99zLG2H0+udXo6M/v8OHDdmc5jxw5oqysLGvCGkccv/R/4/LAgQN5et+mTZv0+++/a+nSpXY/D8eOHcu1fYMGDdSgQQO98sor+uabb9SiRQvNmTNHr7/+uqQbf3cCKF64JBJAgdqwYYMmTpyo0NDQ695/c/r06Rzrsu/fSU9PlySVLl1aknINUPmxYMECu/vqPvvsMyUkJFiz4ElXftH79ttvdenSJWvdihUrckwFnpfaOnfurMzMTM2aNctu/dtvvy2bzWbX/63o3LmzEhMT9cknn1jrMjIy9M4776hMmTJ64IEH8rXf0qVLX/M4W7VqpW3btmnjxo1WYKtQoYLq1q2ryZMnW22yhYeHy9fXV2+88YYuX76cY3+nTp2SdC
XQtWnTRu+//74SEhKu2S67Punm/i4uXLiQY/r3GjVqyMfHxxp3NzpmRwgPD1dsbKzdg9hPnz6thQsX5mt/PXr0kKurqyZMmJDjrJUxRr///nu+avz111/tHqtw8eJF/etf/8rRtnTp0naX8RW0d9991275nXfekSTrZ8kRxy9duf+wdevW+vDDDxUfH59jv9eSfdbrz20uXbqk9957z65dWlqaMjIy7NY1aNBALi4u1vi8me9OAMULZ9gA5NuqVav0448/KiMjQ0lJSdqwYYNiYmJUrVo1/e9//5Onp+c13/vaa69py5YtioiIULVq1ZScnKz33ntPVapUsSaoqFGjhvz9/TVnzhz5+PiodOnSat68eZ7uH/qzcuXKqWXLlhowYICSkpI0ffp01axZ027ShL/+9a/67LPP1LFjRz322GM6evSoPv74Y7tJQPJaW5cuXdS2bVu9/PLLOn78uBo1aqS1a9fq888/17Bhw3LsO78GDRqk999/X/3799euXbtUvXp1ffbZZ/r66681ffr0695TeD1NmzbVunXrNG3aNAUHBys0NNR6rlqrVq00adIk/fzzz3bBrHXr1nr//fdVvXp16zllkuTr66vZs2erb9++atKkiXr16qWKFSsqPj5eK1euVIsWLaxg++6776ply5Zq0KCBnn76ad1xxx1KSkpSbGysfvnlF+3du1fSlV9WXV1dNXnyZKWmpsrDw0Pt2rVTQEBAjmM5dOiQ2rdvr8cee0z16tWTm5ubli1bpqSkJPXq1cvumGfPnq3XX39dNWvWVEBAgNq1a5evz+9mvPjii/r444/14IMPasiQIda0/lWrVtXp06fzfMaqRo0aev311zVmzBgdP35c3bp1k4+Pj44dO6Zly5Zp0KBBGjVqVJ72+be//U2zZs3SE088oaFDh6pSpUpauHCh9XP+5xqbNm2qTz75RCNGjNA999yjMmXKqEuXLnnq73qOHTumhx9+WB07dlRsbKw+/vhjPfnkk2rUqJHDjj/bzJkz1bJlSzVp0kSDBg1SaGiojh8/rpUrV9oF7j+7//77VbZsWUVGRur555+XzWbTRx99lCPkbdiwQYMHD9ajjz6q2rVrKyMjQx999JFcXV3Vs2dPSTf33QmgmCnkWSkBFAPZU1Znv9zd3U1QUJB58MEHzYwZM+ymj8929bTj69evN127djXBwcHG3d3dBAcHmyeeeMIcOnTI7n2ff/65qVevnnFzc7ObRv+BBx4wd911V671XWta///85z9mzJgxJiAgwHh5eZmIiIgcU3Mbc2X68sqVKxsPDw/TokULs3Pnzhz7vF5tV0/rb8yV6eyHDx9ugoODTalSpUytWrXM1KlT7aYCN+bKtP5RUVE5arrW4waulpSUZAYMGGAqVKhg3N3dTYMGDXJ99EBepvX/8ccfTevWrY2Xl5eRZFdHWlqacXV1NT4+PnbT3H/88cdGkunbt2+u+9y4caMJDw83fn5+xtPT09SoUcP079/f7Ny5067d0aNHTb9+/UxQUJApVaqUqVy5svnLX/5iPvvsM7t2//rXv8wdd9xhXF1drzvF/2+//WaioqLMnXfeaUqXLm38/PxM8+bNzaeffmrXLjEx0URERBgfHx8jyfq7v94jLa41rX9un3Nu42n37t2mVatWxsPDw1SpUsVER0ebmTNnGkkmMTEx1+PJdvXPV7b//ve/pmXLlqZ06dKmdOnS5s477zRRUVEmLi7OrpbcfpZyG8c//fSTiYiIMF5eXqZixYpm5MiR5r///a+RZL799lur3blz58yTTz5p/P39jSRrP9k/i1dPW3/s2LFrPiYjt+P8/vvvzSOPPGJ8fHxM2bJlzeDBg80ff/xRoMd/PQcOHDDdu3c3/v7+xtPT09SpU8eMHTvW2p7bWPj666/NfffdZ7y8vExwcLB58cUXzZo1a+zG608//WSeeuopU6NGDePp6WnKlStn2rZta9atW2ft52a/OwEUHzZjnOwudg
AAIOnKBCnvv/++zp0757STSUyfPl3Dhw/XL7/8osqVKzu0r/Hjx2vChAk6deqUKlSo4NC+AMBZcA8bAABO4I8//rBb/v333/XRRx+pZcuWThPWrq7x4sWLev/991WrVi2HhzUAKKm4hw0AACcQFhamNm3aqG7dukpKStIHH3ygtLQ0jR07tqhLs/To0UNVq1ZV48aNlZqaqo8//lg//vhjvidHAQDcGIENAAAn0LlzZ3322Wf65z//KZvNpiZNmuiDDz6wmwa+qIWHh+vf//63Fi5cqMzMTNWrV0+LFy/W448/XtSlAUCxxT1sAAAAAOCkuIcNAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQCAG0hPT9fo0aMVHBwsLy8vNW/eXDExMUVdFgCgBCCwAQBwA/3799e0adPUu3dvzZgxQ66ururcubO2bt1a1KUBAIo5mzHGFHURAAA4q+3bt6t58+aaOnWqRo0aJUm6ePGi6tevr4CAAH3zzTdFXCEAoDjjDBsAANfx2WefydXVVYMGDbLWeXp6auDAgYqNjdXPP/9chNUBAIo7AhsAANexe/du1a5dW76+vnbr7733XknSnj17iqAqAEBJQWADAOA6EhISVKlSpRzrs9edPHmysEsCAJQgBDYAAK7jjz/+kIeHR471np6e1nYAAByFwAYAwHV4eXkpPT09x/qLFy9a2wEAcBQCGwAA11GpUiUlJCTkWJ+9Ljg4uLBLAgCUIAQ2AACuo3Hjxjp06JDS0tLs1m/bts3aDgCAoxDYAAC4jkceeUSZmZn65z//aa1LT0/X3Llz1bx5c4WEhBRhdQCA4s6tqAsAAMCZNW/eXI8++qjGjBmj5ORk1axZU/Pnz9fx48f1wQcfFHV5AIBizmaMMUVdBAAAzuzixYsaO3asPv74Y505c0YNGzbUxIkTFR4eXtSlAQCKOQIbAAAAADgp7mEDAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ8Vz2G5CVlaWTp48KR8fH9lstqIuBwAAAEARMcbo7NmzCg4OlouL489/EdhuwsmTJxUSElLUZQAAAABwEj///LOqVKni8H4IbDfBx8dH0pW/FF9f3yKuBgAAAEBRSUtLU0hIiJURHI3AdhOyL4P09fUlsAEAAAAotFulmHQEAAAAAJwUgQ0AAAAAnBSBDQAAAACcFPewAQAAoFjLzMzU5cuXi7oM3Ebc3d0LZcr+m0FgAwAAQLFkjFFiYqJSUlKKuhTcZlxcXBQaGip3d/eiLoXABgAAgOIpO6wFBATI29u70Gb1w+0tKytLJ0+eVEJCgqpWrVrk44bABgAAgGInMzPTCmvly5cv6nJwm6lYsaJOnjypjIwMlSpVqkhrIbCVIG/HHMpT++EP1nZQJQAAAI6Vfc+at7d3EVeC21H2pZCZmZlFHtic4046AAAAwAGK+nI23J6cadwQ2AAAAADASRHYAAAAACfSpk0bDRs2rKjLkCRt2rRJNpvNITNtjh8/XoGBgbLZbFq+fHmB77+44B42AAAAlCh5va//Vt0u8wK0adNGjRs31vTp0x3e1w8//KAJEyZo2bJluu+++1S2bNkcbTZt2qS2bdvqzJkz8vf3L7C+HbVfRyGwAQAAAChUR48elSR17drVqe4Xc0ZcEgkAAAA4sfT0dI0aNUqVK1dW6dKl1bx5c23atMnaPm/ePPn7+2vNmjWqW7euypQpo44dOyohIcFqk5GRoeeff17+/v4qX768Ro8ercjISHXr1k2S1L9/f23evFkzZsyQzWaTzWbT8ePHrffv2rVLzZ
o1k7e3t+6//37FxcVdt+b9+/erXbt28vLyUvny5TVo0CCdO3dO0pVLIbt06SLpygOqcwtsx48fV9u2bSVJZcuWlc1mU//+/SVdeU5adHS0QkND5eXlpUaNGumzzz6TdOVh6R06dFB4eLiMMZKk06dPq0qVKho3btx19+usCGwAAACAExs8eLBiY2O1ePFi7du3T48++qg6duyow4cPW20uXLigf/zjH/roo4+0ZcsWxcfHa9SoUdb2yZMna+HChZo7d66+/vprpaWl2d03NmPGDIWFhenpp59WQkKCEhISFBISYm1/+eWX9dZbb2nnzp1yc3PTU089dc16z58/r/DwcJUtW1Y7duzQkiVLtG7dOg0ePFiSNGrUKM2dO1eSrL6uFhISov/+97+SpLi4OCUkJGjGjBmSpOjoaC1YsEBz5szRwYMHNXz4cPXp00ebN2+WzWbT/PnztWPHDs2cOVOS9Mwzz6hy5coaN27cdffrrLgkEgAAAHBS8fHxmjt3ruLj4xUcHCzpSuBZvXq15s6dqzfeeEPSlefOzZkzRzVq1JB0JeS99tpr1n7eeecdjRkzRt27d5ckzZo1S19++aW13c/PT+7u7vL29lZQUFCOOiZNmqQHHnhAkvTSSy8pIiJCFy9elKenZ462ixYt0sWLF7VgwQKVLl3a6q9Lly6aPHmyAgMDrXvHcutLklxdXVWuXDlJUkBAgNU+PT1db7zxhtatW6ewsDBJ0h133KGtW7fq/fff1wMPPKDKlSvr/fffV79+/ZSYmKgvv/xSu3fvlpvbleiT236dGYENAAAAcFL79+9XZmamate2n7gkPT1d5cuXt5a9vb2tsCZJlSpVUnJysiQpNTVVSUlJuvfee63trq6uatq0qbKysm6qjoYNG9rtW5KSk5NVtWrVHG1/+OEHNWrUyAprktSiRQtlZWUpLi5OgYGBN9Vnbo4cOaILFy7owQcftFt/6dIl3X333dbyo48+qmXLlunNN9/U7NmzVatWrXz3WdQIbAAAAICTOnfunFxdXbVr1y65urrabStTpoz151KlStlts9ls1j1cBeHP+8++5+xmw15Byr4PbuXKlapcubLdNg8PD+vPFy5csD6zP186ejsisAEAAABO6u6771ZmZqaSk5PVqlWrfO3Dz89PgYGB2rFjh1q3bi1JyszM1HfffafGjRtb7dzd3ZWZmXnLNdetW1fz5s3T+fPnrbNsX3/9tVxcXFSnTp2b3o+7u7tVa7Z69erJw8ND8fHx1iWauRk5cqRcXFy0atUqde7cWREREWrXrt019+vMmHQEAAAAcFK1a9dW79691a9fPy1dulTHjh3T9u3bFR0drZUrV970foYMGaLo6Gh9/vnniouL09ChQ3XmzBm7GRqrV6+ubdu26fjx4/rtt9/yfQatd+/e8vT0VGRkpA4cOKCNGzdqyJAh6tu3b54uh6xWrZpsNptWrFihU6dO6dy5c/Lx8dGoUaM0fPhwzZ8/X0ePHtV3332nd955R/Pnz5d05ezbhx9+qIULF+rBBx/UCy+8oMjISJ05c+aa+3VmRRrYoqOjdc8998jHx0cBAQHq1q1bjilC27RpY00tmv165pln7NrEx8crIiJC3t7eCggI0AsvvKCMjAy7Nps2bVKTJk3k4eGhmjVrat68eY4+PAAAAOCWzZ07V/369dPIkSNVp04ddevWTTt27Mj1/rFrGT16tJ544gn169dPYWFhKlOmjMLDw+0mDRk1apRcXV1Vr149VaxYUfHx8fmq19vbW2vWrNHp06d1zz336JFHHlH79u01a9asPO2ncuXKmjBhgl566SUFBgZas0xOnDhRY8eOVXR0tOrWrauOHTtq5cqVCg0N1alTpzRw4ECNHz9eTZo0kSRNmDBBgYGBVoa41n6dlc0U5MWtedSxY0f16tVL99xzjzIyMvT3v/9dBw4c0Pfff2+dPm3Tpo1q165tN8uNt7e3fH19JV05ldm4cWMFBQVp6t
SpSkhIUL9+/fT0009bs+YcO3ZM9evX1zPPPKO//vWvWr9+vYYNG6aVK1cqPDz8hnWmpaXJz89PqampVr+3o7djDuWp/fAHa9+4EQAAgBO6ePGijh07ptDQ0FxnMizpsrKyVLduXT322GOaOHFiUZfjdK43fgo7GxTpPWyrV6+2W543b54CAgK0a9cu6/paSdecXlSS1q5dq++//17r1q1TYGCgGjdurIkTJ2r06NEaP3683N3dNWfOHIWGhuqtt96SdOW62q1bt+rtt9++qcAGAAAA3M5OnDihtWvX6oEHHlB6erpmzZqlY8eO6cknnyzq0nADTnUPW2pqqqT/ezZCtoULF6pChQqqX7++xowZowsXLljbYmNj1aBBA7vrYcPDw5WWlqaDBw9abTp06GC3z/DwcMXGxuZaR3p6utLS0uxeAAAAwO3KxcVF8+bN0z333KMWLVpo//79WrdunerWrVvUpeEGnGaWyKysLA0bNkwtWrRQ/fr1rfVPPvmkqlWrpuDgYO3bt0+jR49WXFycli5dKklKTEzMcfNi9nJiYuJ126SlpemPP/6Ql5eX3bbo6GhNmDChwI8RAAAAKAohISH6+uuvi7oM5IPTBLaoqCgdOHBAW7dutVs/aNAg688NGjRQpUqV1L59ex09etTu4YAFacyYMRoxYoS1nJaWppCQEIf05cy45w0AAAAoWk5xSeTgwYO1YsUKbdy4UVWqVLlu2+bNm0u68pRzSQoKClJSUpJdm+zl7PvertXG19c3x9k16cpD93x9fe1eAAAAAFDYijSwGWM0ePBgLVu2TBs2bFBoaOgN37Nnzx5JUqVKlSRJYWFh2r9/v5KTk602MTEx8vX1Vb169aw269evt9tPTEyMwsLCCuhIAAAA4Izy+ywxlGxFOJF+DkV6SWRUVJQWLVqkzz//XD4+PtY9Z35+fvLy8tLRo0e1aNEide7cWeXLl9e+ffs0fPhwtW7dWg0bNpQkPfTQQ6pXr5769u2rKVOmKDExUa+88oqioqLk4eEhSXrmmWc0a9Ysvfjii3rqqae0YcMGffrpp3l62CAAAABuH+7u7nJxcdHJkydVsWJFubu72z0kGrgWY4xOnTolm82mUqVKFXU5Rfsctmv90MydO1f9+/fXzz//rD59+ujAgQM6f/68QkJC1L17d73yyit2lymeOHFCzz77rDZt2qTSpUsrMjJSb775ptzc/i+Pbtq0ScOHD9f333+vKlWqaOzYserfv/9N1VlSn8OWV9zDBgAAnMmlS5eUkJBgN8M4cDNsNpuqVKmiMmXK5NhW2NmgSAPb7cJZA5ujA1heEdgAAICzMcYoIyNDmZmZRV0KbiOlSpWSq6trrttK1IOzAQAAAEfKvqzNGS5tA/LDKWaJBAAAAADkRGADAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ0VgAwAAAAAnVaSBLTo6Wvfcc498fHwUEBCgbt26KS4uzq7NxYsXFRUVpfLly6tMmTLq2bOnkpKS7NrEx8crIiJC3t7eCggI0AsvvKCMjAy7Nps2bVKTJk3k4eGhmjVrat68eY4+PAAAAAC4JUUa2DZv3qyoqCh9++23iomJ0eXLl/XQQw/p/PnzVpvhw4friy++0JIlS7R582adPHlSPXr0sLZnZmYqIiJCly5d0jfffKP58+dr3rx5GjdunNXm2LFjioiIUNu2bbVnzx4NGzZMf/3rX7VmzZpCPV4AAAAAyAubMcYUdRHZTp06pYCAAG3evFmtW7dWamqqKlasqEWLFumRRx6RJP
3444+qW7euYmNjdd9992nVqlX6y1/+opMnTyowMFCSNGfOHI0ePVqnTp2Su7u7Ro8erZUrV+rAgQNWX7169VJKSopWr16do4709HSlp6dby2lpaQoJCVFqaqp8fX0d/CncvLdjDhV1Cbdk+IO1i7oEAAAAIE/S0tLk5+dXaNnAqe5hS01NlSSVK1dOkrRr1y5dvnxZHTp0sNrceeedqlq1qmJjYyVJsbGxatCggRXWJCk8PFxpaWk6ePCg1ebP+8huk72Pq0VHR8vPz896hYSEFNxBAgAAAMBNcprAlpWVpWHDhqlFixaqX7++JCkxMVHu7u7y9/e3axsYGKjExESrzZ/DWvb27G3Xa5OWlqY//vgjRy1jxoxRamqq9fr5558L5BgBAAAAIC/cirqAbFFRUTpw4IC2bt1a1KXIw8NDHh4eRV0GAAAAgBLOKc6wDR48WCtWrNDGjRtVpUoVa31QUJAuXbqklJQUu/ZJSUkKCgqy2lw9a2T28o3a+Pr6ysvLq6APBwAAAAAKRJEGNmOMBg8erGXLlmnDhg0KDQ212960aVOVKlVK69evt9bFxcUpPj5eYWFhkqSwsDDt379fycnJVpuYmBj5+vqqXr16Vps/7yO7TfY+AAAAAMAZFeklkVFRUVq0aJE+//xz+fj4WPec+fn5ycvLS35+fho4cKBGjBihcuXKydfXV0OGDFFYWJjuu+8+SdJDDz2kevXqqW/fvpoyZYoSExP1yiuvKCoqyrqs8ZlnntGsWbP04osv6qmnntKGDRv06aefauXKlUV27AAAAABwI0V6hm327NlKTU1VmzZtVKlSJev1ySefWG3efvtt/eUvf1HPnj3VunVrBQUFaenSpdZ2V1dXrVixQq6urgoLC1OfPn3Ur18/vfbaa1ab0NBQrVy5UjExMWrUqJHeeust/fvf/1Z4eHihHi8AAAAA5IVTPYfNWRX2sxZuFs9hAwAAAApXiX4OGwAAAADg/xDYAAAAAMBJ5Suw/fTTTwVdBwAAAADgKvkKbDVr1lTbtm318ccf6+LFiwVdEwAAAABA+Qxs3333nRo2bKgRI0YoKChIf/vb37R9+/aCrg0AAAAASrR8BbbGjRtrxowZOnnypD788EMlJCSoZcuWql+/vqZNm6ZTp04VdJ0AAAAAUOLc0qQjbm5u6tGjh5YsWaLJkyfryJEjGjVqlEJCQtSvXz8lJCQUVJ0AAAAAUOLcUmDbuXOnnnvuOVWqVEnTpk3TqFGjdPToUcXExOjkyZPq2rVrQdUJAAAAACWOW37eNG3aNM2dO1dxcXHq3LmzFixYoM6dO8vF5Ur+Cw0N1bx581S9evWCrBUAAAAASpR8BbbZs2frqaeeUv/+/VWpUqVc2wQEBOiDDz64peIAAAAAoCTLV2A7fPjwDdu4u7srMjIyP7sHAAAAACif97DNnTtXS5YsybF+yZIlmj9//i0XBQAAAADIZ2CLjo5WhQoVcqwPCAjQG2+8cctFAQAAAADyGdji4+MVGhqaY321atUUHx9/y0UBAAAAAPIZ2AICArRv374c6/fu3avy5cvfclEAAAAAgHwGtieeeELPP/+8Nm7cqMzMTGVmZmrDhg0aOnSoevXqVdA1AgAAAECJlK9ZIidOnKjjx4+rffv2cnO7sousrCz169ePe9gAAAAAoIDkK7C5u7vrk08+0cSJE7V37155eXmpQYMGqlatWkHXBwAAAAAlVr4CW7batWurdu3aBVULAAAAAOBP8hXYMjMzNW/ePK1fv17JycnKysqy275hw4YCKQ4AAAAASrJ8BbahQ4dq3rx5ioiIUP369WWz2Qq6LgAAAAAo8fIV2BYvXqxPP/1UnTt3Luh6AAAAAAD/X76m9Xd3d1fNmjULuhYAAAAAwJ/kK7CNHDlSM2bMkDGmoOsBAAAAAPx/+bokcuvWrdq4caNWrVqlu+66S6VKlbLbvnTp0gIpDgAAAABKsnwFNn9/f3Xv3v2WO9+yZYumTp
2qXbt2KSEhQcuWLVO3bt2s7f3799f8+fPt3hMeHq7Vq1dby6dPn9aQIUP0xRdfyMXFRT179tSMGTNUpkwZq82+ffsUFRWlHTt2qGLFihoyZIhefPHFW64ft+btmEN5aj/8QR4hAQAAgJIlX4Ft7ty5BdL5+fPn1ahRIz311FPq0aNHrm06duxo15+Hh4fd9t69eyshIUExMTG6fPmyBgwYoEGDBmnRokWSpLS0ND300EPq0KGD5syZo/379+upp56Sv7+/Bg0aVCDHAQAAAACOkO8HZ2dkZGjTpk06evSonnzySfn4+OjkyZPy9fW1O7t1PZ06dVKnTp2u28bDw0NBQUG5bvvhhx+0evVq7dixQ82aNZMkvfPOO+rcubP+8Y9/KDg4WAsXLtSlS5f04Ycfyt3dXXfddZf27NmjadOmEdgAAAAAOLV8TTpy4sQJNWjQQF27dlVUVJROnTolSZo8ebJGjRpVoAVu2rRJAQEBqlOnjp599ln9/vvv1rbY2Fj5+/tbYU2SOnToIBcXF23bts1q07p1a7m7u1ttwsPDFRcXpzNnzuTaZ3p6utLS0uxeAAAAAFDY8hXYhg4dqmbNmunMmTPy8vKy1nfv3l3r168vsOI6duyoBQsWaP369Zo8ebI2b96sTp06KTMzU5KUmJiogIAAu/e4ubmpXLlySkxMtNoEBgbatclezm5ztejoaPn5+VmvkJCQAjsmAAAAALhZ+bok8quvvtI333xjd9ZKkqpXr65ff/21QAqTpF69ell/btCggRo2bKgaNWpo06ZNat++fYH1c7UxY8ZoxIgR1nJaWhqhDQAAAEChy9cZtqysLOss15/98ssv8vHxueWiruWOO+5QhQoVdOTIEUlSUFCQkpOT7dpkZGTo9OnT1n1vQUFBSkpKsmuTvXyte+M8PDzk6+tr9wIAAACAwpavwPbQQw9p+vTp1rLNZtO5c+f06quvqnPnzgVVWw6//PKLfv/9d1WqVEmSFBYWppSUFO3atctqs2HDBmVlZal58+ZWmy1btujy5ctWm5iYGNWpU0dly5Z1WK0AAAAAcKvyFdjeeustff3116pXr54uXryoJ5980roccvLkyTe9n3PnzmnPnj3as2ePJOnYsWPas2eP4uPjde7cOb3wwgv69ttvdfz4ca1fv15du3ZVzZo1FR4eLkmqW7euOnbsqKefflrbt2/X119/rcGDB6tXr14KDg6WJD355JNyd3fXwIEDdfDgQX3yySeaMWOG3SWPAAAAAOCMbMYYk583ZmRkaPHixdq3b5/OnTunJk2aqHfv3naTkNzIpk2b1LZt2xzrIyMjNXv2bHXr1k27d+9WSkqKgoOD9dBDD2nixIl2k4icPn1agwcPtntw9syZM6/54OwKFSpoyJAhGj169E3XmZaWJj8/P6WmpjrV5ZF5ffD07Y4HZwMAAKCoFXY2yHdgK0kIbM6BwAYAAICiVtjZIF+zRC5YsOC62/v165evYgAAAAAA/ydfgW3o0KF2y5cvX9aFCxfk7u4ub29vAhsAAAAAFIB8TTpy5swZu9e5c+cUFxenli1b6j//+U9B1wgAAAAAJVK+zrDlplatWnrzzTfVp08f/fjjjwW1W8CS13v2uOcNAAAAt7t8nWG7Fjc3N508ebIgdwkAAAAAJVa+zrD973//s1s2xighIUGzZs1SixYtCqQwAAAAACjp8hXYunXrZrdss9lUsWJFtWvXTm+99VZB1AUAAAAAJV6+AltWVlZB1wEAAAAAuEqB3sMGAAAAACg4+TrDNmLEiJtuO23atPx0AQAAAAAlXr4C2+7du7V7925dvnxZderUkSQdOnRIrq6uatKkidXOZrMVTJUAAAAAUALlK7B16dJFPj4+mj9/vsqWLSvpysO0BwwYoFatWmnkyJEFWiQAAAAAlET5uoftrbfeUnR0tBXWJKls2bJ6/fXXmSUSAAAAAApIvgJbWlqaTp06lWP9qVOndPbs2VsuCgAAAA
CQz8DWvXt3DRgwQEuXLtUvv/yiX375Rf/97381cOBA9ejRo6BrBAAAAIASKV/3sM2ZM0ejRo3Sk08+qcuXL1/ZkZubBg4cqKlTpxZogQAAAABQUuUrsHl7e+u9997T1KlTdfToUUlSjRo1VLp06QItDgAAAABKslt6cHZCQoISEhJUq1YtlS5dWsaYgqoLAAAAAEq8fAW233//Xe3bt1ft2rXVuXNnJSQkSJIGDhzIlP4AAAAAUEDyFdiGDx+uUqVKKT4+Xt7e3tb6xx9/XKtXry6w4gAAAACgJMvXPWxr167VmjVrVKVKFbv1tWrV0okTJwqkMAAAAAAo6fJ1hu38+fN2Z9aynT59Wh4eHrdcFAAAAAAgn4GtVatWWrBggbVss9mUlZWlKVOmqG3btgVWHAAAAACUZPm6JHLKlClq3769du7cqUuXLunFF1/UwYMHdfr0aX399dcFXSMAAAAAlEj5OsNWv359HTp0SC1btlTXrl11/vx59ejRQ7t371aNGjUKukYAAAAAKJHyfIbt8uXL6tixo+bMmaOXX37ZETUBAAAAAJSPM2ylSpXSvn37CqTzLVu2qEuXLgoODpbNZtPy5cvtthtjNG7cOFWqVEleXl7q0KGDDh8+bNfm9OnT6t27t3x9feXv76+BAwfq3Llzdm327dunVq1aydPTUyEhIZoyZUqB1A8AAAAAjpSvSyL79OmjDz744JY7P3/+vBo1aqR333031+1TpkzRzJkzNWfOHG3btk2lS5dWeHi4Ll68aLXp3bu3Dh48qJiYGK1YsUJbtmzRoEGDrO1paWl66KGHVK1aNe3atUtTp07V+PHj9c9//vOW6wcAAAAAR8rXpCMZGRn68MMPtW7dOjVt2lSlS5e22z5t2rSb2k+nTp3UqVOnXLcZYzR9+nS98sor6tq1qyRpwYIFCgwM1PLly9WrVy/98MMPWr16tXbs2KFmzZpJkt555x117txZ//jHPxQcHKyFCxfq0qVL+vDDD+Xu7q677rpLe/bs0bRp0+yCHQAAAAA4mzydYfvpp5+UlZWlAwcOqEmTJvLx8dGhQ4e0e/du67Vnz54CKezYsWNKTExUhw4drHV+fn5q3ry5YmNjJUmxsbHy9/e3wpokdejQQS4uLtq2bZvVpnXr1nJ3d7fahIeHKy4uTmfOnMm17/T0dKWlpdm9AAAAAKCw5ekMW61atZSQkKCNGzdKkh5//HHNnDlTgYGBBV5YYmKiJOXYd2BgoLUtMTFRAQEBdtvd3NxUrlw5uzahoaE59pG9rWzZsjn6jo6O1oQJEwrmQAAAAAAgn/J0hs0YY7e8atUqnT9/vkALcgZjxoxRamqq9fr555+LuiQAAAAAJVC+7mHLdnWAK0hBQUGSpKSkJFWqVMlan5SUpMaNG1ttkpOT7d6XkZGh06dPW+8PCgpSUlKSXZvs5ew2V/Pw8JCHh0eBHEdevB1zqND7BAAAAOC88nSGzWazyWaz5VjnCKGhoQoKCtL69eutdWlpadq2bZvCwsIkSWFhYUpJSdGuXbusNhs2bFBWVpaaN29utdmyZYsuX75stYmJiVGdOnVyvRwSAAAAAJxFns6wGWPUv39/6+zTxYsX9cwzz+SYJXLp0qU3tb9z587pyJEj1vKxY8e0Z88elStXTlWrVtWwYcP0+uuvq1atWgoNDdXYsWMVHBysbt26SZLq1q2rjh076umnn9acOXN0+fJlDR48WL169VJwcLAk6cknn9SECRM0cOBAjR49WgcOHNCMGTP09ttv5+XQAQAAAKDQ5SmwRUZG2i336dPnljrfuXOn2rZtay2PGDHC6mfevHl68cUXdf78eQ0aNEgpKSlq2bKlVq9eLU9PT+s9Cxcu1ODBg9W+fXu5uLioZ8+emjlzprXdz89Pa9euVVRUlJo2baoKFSpo3LhxTOkPAAAAwOnZjCNvRCsm0tLS5Ofnp9TUVPn6+jqsH+5hK1jDH6xd1CUAAACgmCmsbJDtli
YdAZxZXgMwAQ8AAADOJk+TjgAAAAAACg+BDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJyUW1EXADiLt2MO5an98AdrO6gSAAAA4ArOsAEAAACAkyKwAQAAAICTIrABAAAAgJNy6sA2fvx42Ww2u9edd95pbb948aKioqJUvnx5lSlTRj179lRSUpLdPuLj4xURESFvb28FBATohRdeUEZGRmEfCgAAAADkmdNPOnLXXXdp3bp11rKb2/+VPHz4cK1cuVJLliyRn5+fBg8erB49eujrr7+WJGVmZioiIkJBQUH65ptvlJCQoH79+qlUqVJ64403Cv1YAAAAACAvnD6wubm5KSgoKMf61NRUffDBB1q0aJHatWsnSZo7d67q1q2rb7/9Vvfdd5/Wrl2r77//XuvWrVNgYKAaN26siRMnavTo0Ro/frzc3d0L+3AAAAAA4KY59SWRknT48GEFBwfrjjvuUO/evRUfHy9J2rVrly5fvqwOHTpYbe+8805VrVpVsbGxkqTY2Fg1aNBAgYGBVpvw8HClpaXp4MGD1+wzPT1daWlpdi8AAAAAKGxOfYatefPmmjdvnurUqaOEhARNmDBBrVq10oEDB5SYmCh3d3f5+/vbvScwMFCJiYmSpMTERLuwlr09e9u1REdHa8KECbdUe16f6QUAAAAAV3PqwNapUyfrzw0bNlTz5s1VrVo1ffrpp/Ly8nJYv2PGjNGIESOs5bS0NIWEhDisPwAAAADIjdNfEvln/v7+ql27to4cOaKgoCBdunRJKSkpdm2SkpKse96CgoJyzBqZvZzbfXHZPDw85Ovra/cCAAAAgMJ2WwW2c+fO6ejRo6pUqZKaNm2qUqVKaf369db2uLg4xcfHKywsTJIUFham/fv3Kzk52WoTExMjX19f1atXr9DrBwAAAIC8cOpLIkeNGqUuXbqoWrVqOnnypF599VW5urrqiSeekJ+fnwYOHKgRI0aoXLly8vX11ZAhQxQWFqb77rtPkvTQQw+pXr166tu3r6ZMmaLExES98sorioqKkoeHRxEfHQAAAABcn1MHtl9++UVPPPGEfv/9d1WsWFEtW7bUt99+q4oVK0qS3n77bbm4uKhnz55KT09XeHi43nvvPev9rq6uWrFihZ599lmFhYWpdOnSioyM1GuvvVZUhwQAAAAAN81mjDFFXYSzS0tLk5+fn1JTU2/6fjZmiSz+hj9Yu6hLAAAAQCHLTza4FbfVPWwAAAAAUJIQ2AAAAADASRHYAAAAAMBJEdgAAAAAwEkR2AAAAADASTn1tP6AM8vrTKDMKgkAAIC84gwbAAAAADgpAhsAAAAAOCkCGwAAAAA4KQIbAAAAADgpAhsAAAAAOClmicyDdzcckWfpMkVdBgAAAIASgsAGFBIeAwAAAIC84pJIAAAAAHBSBDYAAAAAcFIENgAAAABwUgQ2AAAAAHBSBDYAAAAAcFIENgAAAABwUgQ2AAAAAHBSBDYAAAAAcFI8OBtwUjxoGwAAAJxhAwAAAAAnRWADAAAAACfFJZFAMZHXSyglLqMEAABwdiXqDNu7776r6tWry9PTU82bN9f27duLuiQAAAAAuKYSc4btk08+0YgRIzRnzhw1b95c06dPV3h4uOLi4hQQEFDU5QFFgolNAAAAnFuJOcM2bdo0Pf300xowYIDq1aunOXPmyNvbWx9++GFRlwYAAAAAuSoRZ9guXbqkXbt2acyYMdY6FxcXdejQQbGxsTnap6enKz093VpOTU2VJF28cM7xxQJOLHr5d0Vdgp2odjXz1P7dDUccVEn+5LV+AABQ9NLS0iRJxphC6a
9EBLbffvtNmZmZCgwMtFsfGBioH3/8MUf76OhoTZgwIcf613o/4LAaAeTd34u6gFt0u9cPAEBJ9vvvv8vPz8/h/ZSIwJZXY8aM0YgRI6zllJQUVatWTfHx8YXyl4KSKy0tTSEhIfr555/l6+tb1OWgGGOsobAw1lBYGGsoLKmpqapatarKlStXKP2ViMBWoUIFubq6KikpyW59UlKSgoKCcrT38PCQh4dHjvV+fn58AaBQ+Pr6MtZQKBhrKCyMNRQWxhoKi4tL4UwHUiImHXF3d1fTpk21fv16a11WVpbWr1+vsLCwIqwMAAAAAK6tRJxhk6QRI0YoMjJSzZo107333qvp06fr/PnzGjBgQFGXBgAAAAC5KjGB7fHHH9epU6c0btw4JSYmqnHjxlq9enWOiUhy4+HhoVdffTXXyySBgsRYQ2FhrKGwMNZQWBhrKCyFPdZsprDmowQAAAAA5EmJuIcNAAAAAG5HBDYAAAAAcFIENgAAAABwUgQ2AAAAAHBSBDYAAAAAcFIEtpvw7rvvqnr16vL09FTz5s21ffv2oi4Jt5Ho6Gjdc8898vHxUUBAgLp166a4uDi7NhcvXlRUVJTKly+vMmXKqGfPnkpKSrJrEx8fr4iICHl7eysgIEAvvPCCMjIyCvNQcJt58803ZbPZNGzYMGsdYw0F5ddff1WfPn1Uvnx5eXl5qUGDBtq5c6e13RijcePGqVKlSvLy8lKHDh10+PBhu32cPn1avXv3lq+vr/z9/TVw4ECdO3eusA8FTiwzM1Njx45VaGiovLy8VKNGDU2cOFF/nuScsYb82LJli7p06aLg4GDZbDYtX77cbntBjat9+/apVatW8vT0VEhIiKZMmZL3Yg2ua/Hixcbd3d18+OGH5uDBg+bpp582/v7+JikpqahLw20iPDzczJ071xw4cMDs2bPHdO7c2VStWtWcO3fOavPMM8+YkJAQs379erNz505z3333mfvvv9/anpGRYerXr286dOhgdu/ebb788ktToUIFM2bMmKI4JNwGtm/fbqpXr24aNmxohg4daq1nrKEgnD592lSrVs3079/fbNu2zfz0009mzZo15siRI1abN9980/j5+Znly5ebvXv3mocfftiEhoaaP/74w2rTsWNH06hRI/Ptt9+ar776ytSsWdM88cQTRXFIcFKTJk0y5cuXNytWrDDHjh0zS5YsMWXKlDEzZsyw2jDWkB9ffvmlefnll83SpUuNJLNs2TK77QUxrlJTU01gYKDp3bu3OXDggPnPf/5jvLy8zPvvv5+nWglsN3DvvfeaqKgoazkzM9MEBweb6OjoIqwKt7Pk5GQjyWzevNkYY0xKSoopVaqUWbJkidXmhx9+MJJMbGysMebKl4qLi4tJTEy02syePdv4+vqa9PT0wj0AOL2zZ8+aWrVqmZiYGPPAAw9YgY2xhoIyevRo07Jly2tuz8rKMkFBQWbq1KnWupSUFOPh4WH+85//GGOM+f77740ks2PHDqvNqlWrjM1mM7/++qvjisdtJSIiwjz11FN263r06GF69+5tjGGsoWBcHdgKaly99957pmzZsnb/fo4ePdrUqVMnT/VxSeR1XLp0Sbt27VKHDh2sdS4uLurQoYNiY2OLsDLczlJTUyVJ5cqVkyTt2rVLly9fthtnd955p6pWrWqNs9jYWDVo0ECBgYFWm/DwcKWlpengwYOFWD1uB1FRUYqIiLAbUxJjDQXnf//7n5o1a6ZHH31UAQEBuvvuu/Wvf/3L2n7s2DElJibajTU/Pz81b97cbqz5+/urWbNmVpsOHTrIxcVF27ZtK7yDgVO7//77tX79eh06dEiStHfvXm3dulWdOnWSxFiDYxTUuIqNjVXr1q3l7u5utQkPD1dcXJzOnDlz0/W43eoBFWe//fabMjMz7X5xkaTAwED9+OOPRVQVbmdZWVkaNmyYWrRoofr160uSEhMT5e7uLn9/f7u2gYGBSkxMtNrkNg6ztwHZFi
9erO+++047duzIsY2xhoLy008/afbs2RoxYoT+/ve/a8eOHXr++efl7u6uyMhIa6zkNpb+PNYCAgLstru5ualcuXKMNVheeuklpaWl6c4775Srq6syMzM1adIk9e7dW5IYa3CIghpXiYmJCg0NzbGP7G1ly5a9qXoIbEAhioqK0oEDB7R169aiLgXF0M8//6yhQ4cqJiZGnp6eRV0OirGsrCw1a9ZMb7zxhiTp7rvv1oEDBzRnzhxFRkYWcXUoTj799FMtXLhQixYt0l133aU9e/Zo2LBhCg4OZqyhxOCSyOuoUKGCXF1dc8yglpSUpKCgoCKqCrerwYMHa8WKFdq4caOqVKlirQ8KCtKlS5eUkpJi1/7P4ywoKCjXcZi9DZCuXPKYnJysJk2ayM3NTW5ubtq8ebNmzpwpNzc3BQYGMtZQICpVqqR69erZratbt67i4+Ml/d9Yud6/n0FBQUpOTrbbnpGRodOnTzPWYHnhhRf00ksvqVevXmrQoIH69u2r4cOHKzo6WhJjDY5RUOOqoP5NJbBdh7u7u5o2bar169db67KysrR+/XqFhYUVYWW4nRhjNHjwYC1btkwbNmzIcWq8adOmKlWqlN04i4uLU3x8vDXOwsLCtH//frsvhpiYGPn6+ub4pQklV/v27bV//37t2bPHejVr1ky9e/e2/sxYQ0Fo0aJFjseTHDp0SNWqVZMkhYaGKigoyG6spaWladu2bXZjLSUlRbt27bLabNiwQVlZWWrevHkhHAVuBxcuXJCLi/2vq66ursrKypLEWINjFNS4CgsL05YtW3T58mWrTUxMjOrUqXPTl0NKYlr/G1m8eLHx8PAw8+bNM99//70ZNGiQ8ff3t5tBDbieZ5991vj5+ZlNmzaZhIQE63XhwgWrzTPPPGOqVq1qNmzYYHbu3GnCwsJMWFiYtT17qvWHHnrI7Nmzx6xevdpUrFiRqdZxQ3+eJdIYxhoKxvbt242bm5uZNGmSOXz4sFm4cKHx9vY2H3/8sdXmzTffNP7+/ubzzz83+/btM127ds11Suy7777bbNu2zWzdutXUqlWLqdZhJzIy0lSuXNma1n/p0qWmQoUK5sUXX7TaMNaQH2fPnjW7d+82u3fvNpLMtGnTzO7du82JEyeMMQUzrlJSUkxgYKDp27evOXDggFm8eLHx9vZmWn9HeOedd0zVqlWNu7u7uffee823335b1CXhNiIp19fcuXOtNn/88Yd57rnnTNmyZY23t7fp3r27SUhIsNvP8ePHTadOnYyXl5epUKGCGTlypLl8+XIhHw1uN1cHNsYaCsoXX3xh6tevbzw8PMydd95p/vnPf9ptz8rKMmPHjjWBgYHGw8PDtG/f3sTFxdm1+f33380TTzxhypQpY3x9fc2AAQPM2bNnC/Mw4OTS0tLM0KFDTdWqVY2np6e54447zMsvv2w3TTpjDfmxcePGXH8/i4yMNMYU3Ljau3evadmypfHw8DCVK1c2b775Zp5rtRnzp0fFAwAAO+fOndPUqVO1bds2bd++XWfOnNHcuXPVv3//oi4NAFACcA8bAADX8dtvv+m1117TDz/8oEaNGhV1OQCAEoZp/QEAuI5KlSopISFBQUFB2rlzp+65556iLgkAUIJwhg0AgOvw8PBg6m8AQJEhsAEAAACAkyKwAQAAAICTIrABAAAAgJMisAEAAACAkyKwAQAAAICTIrABAAAAgJMisAEAAACAk+LB2QAA3MCsWbOUkpKikydPSpK++OIL/fLLL5KkIUOGyM/PryjLAwAUYzZjjCnqIgAAcGbVq1fXiRMnct127NgxVa9evXALAgCUGAQ2AAAAAHBS3MMGAAAAAE6KwAYAAAAATorABgAAAABOisAGAAAAAE6KwAYAAAAATornsN2ErKwsnTx5Uj4+PrLZbEVdDgAAAIAiYozR2bNnFRwcLBcXx5//IrDdhJMnTyokJKSoywAAAADgJH7++WdVqVLF4f0Q2G6Cj4+PpC
t/Kb6+vkVcDQAAAICikpaWppCQECsjOBqB7SZkXwbp6+tLYAMAAABQaLdKMekIAAAAADgpAhsAAAAAOCkCGwAAAAA4Ke5hAwAAQLGWmZmpy5cvF3UZuI24u7sXypT9N4PABgAAgGLJGKPExESlpKQUdSm4zbi4uCg0NFTu7u5FXQqBDQAAAMVTdlgLCAiQt7d3oc3qh9tbVlaWTp48qYSEBFWtWrXIxw2BDQAAAMVOZmamFdbKly9f1OXgNlOxYkWdPHlSGRkZKlWqVJHWQmBDDm/HHLJbHv5g7SKqBAAAIH+y71nz9vYu4kpwO8q+FDIzM7PIA5tz3EkHAAAAOEBRX86G25MzjRsCGwAAAAA4KQJbCfR2zCG7FwAAAJxHmzZtNGzYsKIuQ5K0adMm2Ww2h8y0OX78eAUGBspms2n58uUFvv/ignvYAAAAUKIU9n9Y3y7zAbRp00aNGzfW9OnTHd7XDz/8oAkTJmjZsmW67777VLZs2RxtNm3apLZt2+rMmTPy9/cvsL4dtV9HIbABAAAAKFRHjx6VJHXt2tWp7hdzRlwSCQAAADix9PR0jRo1SpUrV1bp0qXVvHlzbdq0ydo+b948+fv7a82aNapbt67KlCmjjh07KiEhwWqTkZGh559/Xv7+/ipfvrxGjx6tyMhIdevWTZLUv39/bd68WTNmzJDNZpPNZtPx48et9+/atUvNmjWTt7e37r//fsXFxV235v3796tdu3by8vJS+fLlNWjQIJ07d07SlUshu3TpIunKA6pzC2zHjx9X27ZtJUlly5aVzWZT//79JV15Tlp0dLRCQ0Pl5eWlRo0a6bPPPpN05WHpHTp0UHh4uIwxkqTTp0+rSpUqGjdu3HX366wIbAAAAIATGzx4sGJjY7V48WLt27dPjz76qDp27KjDhw9bbS5cuKB//OMf+uijj7RlyxbFx8dr1KhR1vbJkydr4cKFmjt3rr7++mulpaXZ3Tc2Y8YMhYWF6emnn1ZCQoISEhIUEhJibX/55Zf11ltvaefOnXJzc9NTTz11zXrPnz+v8PBwlS1bVjt27NCSJUu0bt06DR48WJI0atQozZ07V5Ksvq4WEhKi//73v5KkuLg4JSQkaMaMGZKk6OhoLViwQHPmzNHBgwc1fPhw9enTR5s3b5bNZtP8+fO1Y8cOzZw5U5L0zDPPqHLlyho3btx19+usuCQSAAAAcFLx8fGaO3eu4uPjFRwcLOlK4Fm9erXmzp2rN954Q9KV587NmTNHNWrUkHQl5L322mvWft555x2NGTNG3bt3lyTNmjVLX375pbXdz89P7u7u8vb2VlBQUI46Jk2apAceeECS9NJLLykiIkIXL16Up6dnjraLFi3SxYsXtWDBApUuXdrqr0uXLpo8ebICAwOte8dy60uSXF1dVa5cOUlSQECA1T49PV1vvPGG1q1bp7CwMEnSHXfcoa1bt+r999/XAw88oMqVK+v9999Xv379lJiYqC+//FK7d++Wm9uV6JPbfp0ZgQ0AAABwUvv371dmZqZq17afuCQ9PV3ly5e3lr29va2wJkmVKlVScnKyJCk1NVVJSUm69957re2urq5q2rSpsrKybqqOhg0b2u1bkpKTk1W1atUcbX/44Qc1atTICmuS1KJFC2VlZSkuLk6BgYE31Wdujhw5ogsXLujBBx+0W3/p0iXdfffd1vKjjz6qZcuW6c0339Ts2bNVq1atfPdZ1AhsAAAAgJM6d+6cXF1dtWvXLrm6utptK1OmjPXnUqVK2W2z2WzWPVwF4c/7z77n7GbDXkHKvg9u5cqVqly5st02Dw8P688XLlywPrM/Xzp6OyKwAQAAAE7q7rvvVmZmppKTk9WqVat87cPPz0+BgYHasWOHWrduLUnKzMzUd999p8aNG1vt3N3dlZmZecs1161bV/PmzdP58+ets2xff/21XFxcVKdOnZvej7u7u1Vrtnr16snDw0Px8fHWJZq5GTlypFxcXLRq1Sp17t
xZERERateu3TX368yYdAQAAABwUrVr11bv3r3Vr18/LV26VMeOHdP27dsVHR2tlStX3vR+hgwZoujoaH3++eeKi4vT0KFDdebMGbsZGqtXr65t27bp+PHj+u233/J9Bq13797y9PRUZGSkDhw4oI0bN2rIkCHq27dvni6HrFatmmw2m1asWKFTp07p3Llz8vHx0ahRozR8+HDNnz9fR48e1Xfffad33nlH8+fPl3Tl7NuHH36ohQsX6sEHH9QLL7ygyMhInTlz5pr7dWYENgAAAMCJzZ07V/369dPIkSNVp04ddevWTTt27Mj1/rFrGT16tJ544gn169dPYWFhKlOmjMLDw+0mDRk1apRcXV1Vr149VaxYUfHx8fmq19vbW2vWrNHp06d1zz336JFHHlH79u01a9asPO2ncuXKmjBhgl566SUFBgZas0xOnDhRY8eOVXR0tOrWrauOHTtq5cqVCg0N1alTpzRw4ECNHz9eTZo0kSRNmDBBgYGBeuaZZ667X2dlMwV5cWsxlZaWJj8/P6WmpsrX17eoy7llb8ccslse/mDtPG0HAABwdhcvXtSxY8cUGhqa60yGJV1WVpbq1q2rxx57TBMnTizqcpzO9cZPYWcD7mEDAAAAirkTJ05o7dq1euCBB5Senq5Zs2bp2LFjevLJJ4u6NNwAl0QCAAAAxZyLi4vmzZune+65Ry1atND+/fu1bt061a1bt6hLww1whg0AAAAo5kJCQvT1118XdRnIB86wAQAAAICT4gwbckwyAgAAAMA5cIYNAAAAxVZ+nyWGks2ZJtLnDBsAAACKHXd3d7m4uOjkyZOqWLGi3N3d7R4SDVyLMUanTp2SzWZTqVKlirocAhsAAACKHxcXF4WGhiohIUEnT54s6nJwm7HZbKpSpYpcXV2LuhQCG/KHh2sDAABn5+7urqpVqyojI0OZmZlFXQ5uI6VKlXKKsCYR2AAAAFCMZV/W5gyXtgH5waQjAAAAAOCkCGwAAAAA4KS4JLKY4d4yAAAAoPjgDBsAAAAAOCkCGwAAAAA4KQIbAAAAADgpAhsAAAAAOCkCGwAAAAA4KQIbAAAAADgpAhsAAAAAOCkCGwAAAAA4KQIbAAAAADgppw5ss2fPVsOGDeXr6ytfX1+FhYVp1apV1vaLFy8qKipK5cuXV5kyZdSzZ08lJSXZ7SM+Pl4RERHy9vZWQECAXnjhBWVkZBT2oQAAAABAnjl1YKtSpYrefPNN7dq1Szt37lS7du3UtWtXHTx4UJI0fPhwffHFF1qyZIk2b96skydPqkePHtb7MzMzFRERoUuXLumbb77R/PnzNW/ePI0bN66oDgkAAAAAbppbURdwPV26dLFbnjRpkmbPnq1vv/1WVapU0QcffKBFixapXbt2kqS5c+eqbt26+vbbb3Xfffdp7dq1+v7777Vu3ToFBgaqcePGmjhxokaPHq3x48fL3d29KA4LAAAAAG6KU59h+7PMzEwtXrxY58+fV1hYmHbt2qXLly+rQ4cOVps777xTVatWVWxsrCQpNjZWDRo0UGBgoNUmPDxcaWlp1lm63KSnpystLc3uBQAAAACFzekD2/79+1WmTBl5eHjomWee0bJly1SvXj0lJibK3d1d/v7+du0DAwOVmJgoSUpMTLQLa9nbs7ddS3R0tPz8/KxXSEhIwR4UAAAAANwEpw9sderU0Z49e7Rt2zY9++yzioyM1Pfff+/QPseMGaPU1FTr9fPPPzu0PwAAAADIjVPfwyZJ7u7uqlmzpiSpadOm2rFjh2bMmKHHH39cly5dUkpKit1ZtqSkJAUFBUmSgoKCtH37drv9Zc8imd0mNx4eHvLw8CjgI7l9vR1zqKhLAAAAAEokpz/DdrWsrCylp6eradOmKlWqlNavX29ti4uLU3x8vMLCwiRJYWFh2r9/v5KTk602MTEx8vX1Vb169Qq9dgAAAADIC6c+wzZmzBh16tRJVatW1dmzZ7Vo0SJt2rRJa9askZ
+fnwYOHKgRI0aoXLly8vX11ZAhQxQWFqb77rtPkvTQQw+pXr166tu3r6ZMmaLExES98sorioqK4gwaAAAAAKfn1IEtOTlZ/fr1U0JCgvz8/NSwYUOtWbNGDz74oCTp7bfflouLi3r27Kn09HSFh4frvffes97v6uqqFStW6Nlnn1VYWJhKly6tyMhIvfbaa0V1SAAAAABw02zGGFPURTi7tLQ0+fn5KTU1Vb6+vkVdznVdfb/Z8Adr37BNQcitHwAAAKC4KexscNvdwwYAAAAAJQWBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACclFNP649b54gZIQEAAAAUDs6wAQAAAICTIrABAAAAgJMisAEAAACAkyKwAQAAAICTYtIRFIirJzcZ/mDtIqoEAAAAKD44wwYAAAAATorABgAAAABOisAGAAAAAE6KwAYAAAAATsphge2nn35y1K4BAAAAoERwWGCrWbOm2rZtq48//lgXL150VDcAAAAAUGw5LLB99913atiwoUaMGKGgoCD97W9/0/bt2x3VHQAAAAAUOw4LbI0bN9aMGTN08uRJffjhh0pISFDLli1Vv359TZs2TadOnXJU1wAAAABQLDh80hE3Nzf16NFDS5Ys0eTJk3XkyBGNGjVKISEh6tevnxISEhxdAgAAAADclhwe2Hbu3KnnnntOlSpV0rRp0zRq1CgdPXpUMTExOnnypLp27eroEgAAAADgtuTmqB1PmzZNc+fOVVxcnDp37qwFCxaoc+fOcnG5khFDQ0M1b948Va9e3VElAAAAAMBtzWGBbfbs2XrqqafUv39/VapUKdc2AQEB+uCDDxxVAgAAAADc1hwW2A4fPnzDNu7u7oqMjHRUCQAAAABwW3PYPWxz587VkiVLcqxfsmSJ5s+f76huAQAAAKDYcFhgi46OVoUKFXKsDwgI0BtvvOGobgEAAACg2HDYJZHx8fEKDQ3Nsb5atWqKj493VLdwEm/HHLJbHv5g7SKqBAAAALh9OewMW0BAgPbt25dj/d69e1W+fHlHdQsAAAAAxYbDAtsTTzyh559/Xhs3blRmZqYyMzO1YcMGDR06VL169XJUtwAAAABQbDjsksiJEyfq+PHjat++vdzcrnSTlZWlfv36cQ8bAAAAANwEhwU2d3d3ffLJJ5o4caL27t0rLy8vNWjQQNWqVXNUlwAAAABQrDgssGWrXbu2atdmwgkAAAAAyCuHBbbMzEzNmzdP69evV3JysrKysuy2b9iwwVFdAwAAAECx4LDANnToUM2bN08RERGqX7++bDabo7oCAAAAgGLJYYFt8eLF+vTTT9W5c2dHdQEAAAAAxZrDpvV3d3dXzZo1HbV7AAAAACj2HBbYRo4cqRkzZsgY46guAAAAAKBYc9glkVu3btXGjRu1atUq3XXXXSpVqpTd9qVLlzqqawAAAAAoFhwW2Pz9/dW9e3dH7R4AAAAAij2HBba5c+c6atcAAAAAUCI47B42ScrIyNC6dev0/vvv6+zZs5KkkydP6ty5c47sFgAAAACKBYedYTtx4oQ6duyo+Ph4paen68EHH5SPj48mT56s9PR0zZkzx1FdAwAAAECx4LAzbEOHDlWzZs105swZeXl5Weu7d++u9evX39Q+oqOjdc8998jHx0cBAQHq1q2b4uLi7NpcvHhRUVFRKl++vMqUKaOePXsqKSnJrk18fLwiIiLk7e2tgIAAvfDCC8rIyLj1gwQAAAAAB3JYYPvqq6/0yiuvyN3d3W599erV9euvv97UPjZv3qyoqCh9++23iomJ0eXLl/XQQw/p/PnzVpvhw4friy++0JIlS7R582adPHlSPXr0sLZnZmYqIiJCly5d0jfffKP58+dr3rx5GjduXMEcKAAAAAA4iMMuiczKylJmZmaO9b/88ot8fHxuah+rV6+2W543b54CAgK0a9cutW7dWqmpqfrggw+0aNEitWvXTtKVyU7q1q2rb7/9Vvfdd5/Wrl
2r77//XuvWrVNgYKAaN26siRMnavTo0Ro/fnyOQHm7eTvmUFGXAAAAAMBBHHaG7aGHHtL06dOtZZvNpnPnzunVV19V586d87XP1NRUSVK5cuUkSbt27dLly5fVoUMHq82dd96pqlWrKjY2VpIUGxurBg0aKDAw0GoTHh6utLQ0HTx4MNd+0tPTlZaWZvcCAAAAgMLmsMD21ltv6euvv1a9evV08eJFPfnkk9blkJMnT87z/rKysjRs2DC1aNFC9evXlyQlJibK3d1d/v7+dm0DAwOVmJhotflzWMvenr0tN9HR0fLz87NeISEhea4XAAAAAG6Vwy6JrFKlivbu3avFixdr3759OnfunAYOHKjevXvbTUJys6KionTgwAFt3brVAdXaGzNmjEaMGGEtp6WlEdoAAAAAFDqHBTZJcnNzU58+fW55P4MHD9aKFSu0ZcsWValSxVofFBSkS5cuKSUlxe4sW1JSkoKCgqw227dvt9tf9iyS2W2u5uHhIQ8Pj1uuGwAAAABuhcMC24IFC667vV+/fjfchzFGQ4YM0bJly7Rp0yaFhobabW/atKlKlSql9evXq2fPnpKkuLg4xcfHKywsTJIUFhamSZMmKTk5WQEBAZKkmJgY+fr6ql69evk5NAAAAAAoFA4LbEOHDrVbvnz5si5cuCB3d3d5e3vfVGCLiorSokWL9Pnnn8vHx8e658zPz09eXl7y8/PTwIEDNWLECJUrV06+vr4aMmSIwsLCdN9990m6MvlJvXr11LdvX02ZMkWJiYl65ZVXFBUVxVk0AAAAAE7NYYHtzJkzOdYdPnxYzz77rF544YWb2sfs2bMlSW3atLFbP3fuXPXv31+S9Pbbb8vFxUU9e/ZUenq6wsPD9d5771ltXV1dtWLFCj377LMKCwtT6dKlFRkZqddeey1/BwYAAAAAhcRmjDGF2eHOnTvVp08f/fjjj4XZ7S1JS0uTn5+fUlNT5evrW9Tl2LldnsM2/MHaRV0CAAAAcMsKOxs4bFr/a3Fzc9PJkycLu1sAAAAAuO047JLI//3vf3bLxhglJCRo1qxZatGihaO6BQAAAIBiw2GBrVu3bnbLNptNFStWVLt27fTWW285qlsAAAAAKDYcFtiysrIctWsAAAAAKBEK/R42AAAAAMDNcdgZthEjRtx022nTpjmqDAAAAAC4bTkssO3evVu7d+/W5cuXVadOHUnSoUOH5OrqqiZNmljtbDabo0qAk7v6kQRM/Q8AAADYc1hg69Kli3x8fDR//nyVLVtW0pWHaQ8YMECtWrXSyJEjHdU1AAAAABQLDruH7a233lJ0dLQV1iSpbNmyev3115klEgAAAABugsMCW1pamk6dOpVj/alTp3T27FlHdQsAAAAAxYbDAlv37t01YMAALV26VL/88ot++eUX/fe//9XAgQPVo0cPR3ULAAAAAMWGw+5hmzNnjkaNGqUnn3xSly9fvtKZm5sGDhyoqVOnOqpbAAAAACg2HBbYvL299d5772nq1Kk6evSoJKlGjRoqXbq0o7oEAAAAgGLFYYEtW0JCghISEtS6dWt5eXnJGMNU/iXQ1VP4AwAAALgxh93D9vvvv6t9+/aqXbu2OnfurISEBEnSwIEDmdIfAAAAAG6CwwLb8OHDVapUKcXHx8vb29ta//jjj2v16tWO6hYAAAAAig2HXRK5du1arVmzRlWqVLFbX6tWLZ04ccJR3QIAAABAseGwM2znz5+3O7OW7fTp0/Lw8HBUtwAAAABQbDgssLVq1UoLFiywlm02m7KysjRlyhS1bdvWUd0CAAAAQLHhsEsip0yZovbt22vnzp26dOmSXnzxRR08eFCnT5/W119/7ahuAQAAAKDYcNgZtvr16+vQoUNq2bKlunbtqvPnz6tHjx7avXu3atSo4ahuAQAAAKDYcMgZtsuXL6tjx46aM2eOXn75ZUd0AQAAAADFnkPOsJUqVUr79u1zxK4BAAAAoMRw2CWRffr00QcffO
Co3QMAAABAseewSUcyMjL04Ycfat26dWratKlKly5tt33atGmO6hoAAAAAioUCD2w//fSTqlevrgMHDqhJkyaSpEOHDtm1sdlsBd0tAAAAABQ7BR7YatWqpYSEBG3cuFGS9Pjjj2vmzJkKDAws6K4AAAAAoFgr8HvYjDF2y6tWrdL58+cLuhsAAAAAKPYcNulItqsDHAAAAADg5hR4YLPZbDnuUeOeNQAAAADIuwK/h80Yo/79+8vDw0OSdPHiRT3zzDM5ZolcunRpQXcNAAAAAMVKgQe2yMhIu+U+ffoUdBcAAAAAUCIUeGCbO3duQe8Sf/J2zKEbNwIAAABQLDh80hEAAAAAQP4Q2AAAAADASRHYAAAAAMBJEdgAAAAAwEkR2AAAAADASRHYAAAAAMBJEdgAAAAAwEkV+HPYgPy6+hlzwx+sXUSVAAAAAM6BM2wAAAAA4KScPrBt2bJFXbp0UXBwsGw2m5YvX2633RijcePGqVKlSvLy8lKHDh10+PBhuzanT59W79695evrK39/fw0cOFDnzp0rxKMAAAAAgLxz+sB2/vx5NWrUSO+++26u26dMmaKZM2dqzpw52rZtm0qXLq3w8HBdvHjRatO7d28dPHhQMTExWrFihbZs2aJBgwYV1iEgn96OOWT3AgAAAEoap7+HrVOnTurUqVOu24wxmj59ul555RV17dpVkrRgwQIFBgZq+fLl6tWrl3744QetXr1aO3bsULNmzSRJ77zzjjp37qx//OMfCg4OLrRjAQAAAIC8cPozbNdz7NgxJSYmqkOHDtY6Pz8/NW/eXLGxsZKk2NhY+fv7W2FNkjp06CAXFxdt27Yt1/2mp6crLS3N7gUAAAAAhe22DmyJiYmSpMDAQLv1gYGB1rbExEQFBATYbXdzc1O5cuWsNleLjo6Wn5+f9QoJCXFA9QAAAABwfU5/SWRRGDNmjEaMGGEtp6WlEdqcQG73sTH1PwAAAIqz2/oMW1BQkCQpKSnJbn1SUpK1LSgoSMnJyXbbMzIydPr0aavN1Tw8POTr62v3AgAAAIDCdlsHttDQUAUFBWn9+vXWurS0NG3btk1hYWGSpLCwMKWkpGjXrl1Wmw0bNigrK0vNmzcv9JoBAAAA4GY5/SWR586d05EjR6zlY8eOac+ePSpXrpyqVq2qYcOG6fXXX1etWrUUGhqqsWPHKjg4WN26dZMk1a1bVx07dtTTTz+tOXPm6PLlyxo8eLB69erFDJEAAAAAnJrTB7adO3eqbdu21nL2vWWRkZGaN2+eXnzxRZ0/f16DBg1SSkqKWrZsqdWrV8vT09N6z8KFCzV48GC1b99eLi4u6tmzp2bOnFnoxwIAAAAAeWEzxpiiLsLZpaWlyc/PT6mpqUV+PxsPkLbHpCMAAAAoTIWdDZz+DBtwPVcHWAIcAAAAipPbetIRAAAAACjOCGwAAAAA4KQIbAAAAADgpAhsAAAAAOCkCGwAAAAA4KQIbAAAAADgpAhsAAAAAOCkCGwAAAAA4KQIbAAAAADgpNyKugCgIL0dc8huefiDtYuoEgAAAODWcYYNAAAAAJwUgQ0AAAAAnBSBDQAAAACcFPewObGr78cCAAAAULJwhg0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFIENAAAAAJwUgQ0AAAAAnBSBDQAAAACcFA/ORomS28PIhz9YuwgqAQAAAG6MM2wAAAAA4KQIbAAAAADgpLgkEsVabpdA5vU9XDIJAACAokJgcyL5CRcAAAAAii8uiQQAAAAAJ8UZNpR4NzqzySWSAAAAKCqcYQMAAAAAJ0VgAwAAAAAnRWADAAAAACdFYAMAAAAAJ8WkI0Ae5TZJCRORAAAAwBE4wwYAAAAATorABgAAAABOisAGAAAAAE6Ke9iK0I0e2Izbx40ers3DtwEAAJAfBDbAAW4Uxpm4BAAAADeDSyIBAAAAwElxhg1wElw2CQAAgKsR2AAnRYADAA
BAiQps7777rqZOnarExEQ1atRI77zzju69995C659JRgAAAADkRYkJbJ988olGjBihOXPmqHnz5po+fbrCw8MVFxengICAW94/Z0NQXDCWAQAAnIfNGGOKuojC0Lx5c91zzz2aNWuWJCkrK0shISEaMmSIXnrpJbu26enpSk9Pt5ZTU1NVtWpV/fzzz/L19c11/+9uOOK44oF8impX02756nF69fbc2txonwAAACVJWlqaQkJClJKSIj8/P4f3VyIC26VLl+Tt7a3PPvtM3bp1s9ZHRkYqJSVFn3/+uV378ePHa8KECYVcJQAAAIDbxdGjR3XHHXc4vJ8ScUnkb7/9pszMTAUGBtqtDwwM1I8//pij/ZgxYzRixAhrOSUlRdWqVVN8fHyhpGiUXNn/Y3O9s7lAQWCsobAw1lBYGGsoLNlX35UrV65Q+isRgS2vPDw85OHhkWO9n58fXwAoFL6+vow1FArGGgoLYw2FhbGGwuLiUjiPtC4RD86uUKGCXF1dlZSUZLc+KSlJQUFBRVQVAAAAAFxfiQhs7u7uatq0qdavX2+ty8rK0vr16xUWFlaElQEAAADAtZWYSyJHjBihyMhINWvWTPfee6+mT5+u8+fPa8CAATd8r4eHh1599dVcL5MEChJjDYWFsYbCwlhDYWGsobAU9lgrEbNEZps1a5b14OzGjRtr5syZat68eVGXBQAAAAC5KlGBDQAAAABuJyXiHjYAAAAAuB0R2AAAAADASRHYAAAAAMBJEdgAAAAAwEkR2G7Cu+++q+rVq8vT01PNmzfX9u3bi7ok3Eaio6N1zz33yMfHRwEBAerWrZvi4uLs2ly8eFFRUVEqX768ypQpo549e+Z40Ht8fLwiIiLk7e2tgIAAvfDCC8rIyCjMQ8Ft5s0335TNZtOwYcOsdYw1FJRff/1Vffr0Ufny5eXl5aUGDRpo586d1nZjjMaNG6dKlSrJy8tLHTp00OHDh+32cfr0afXu3Vu+vr7y9/fXwIEDde7cucI+FDixzMxMjR07VqGhofLy8lKNGjU0ceJE/XnOPMYa8mPLli3q0qWLgoODZbPZtHz5crvtBTWu9u3bp1atWsnT01MhISGaMmVK3os1uK7Fixcbd3d38+GHH5qDBw+ap59+2vj7+5ukpKSiLg23ifDwcDN37lxz4MABs2fPHtO5c2dTtWpVc+7cOavNM888Y0JCQsz69evNzp07zX333Wfuv/9+a3tGRoapX7++6dChg9m9e7f58ssvTYUKFcyYMWOK4pBwG9i+fbupXr26adiwoRk6dKi1nrGGgnD69GlTrVo1079/f7Nt2zbz008/mTVr1pgjR45Ybd58803j5+dnli9fbvbu3WsefvhhExoaav744w+rTceOHU2jRo3Mt99+a7766itTs2ZN88QTTxTFIcFJTZo0yZQvX96sWLHCHDt2zCxZssSUKVPGzJgxw2rDWEN+fPnll+bll182S5cuNZLMsmXL7LYXxLhKTU01gYGBpnfv3ubAgQPmP//5j/Hy8jLvv/9+nmolsN3Avffea6KioqzlzMxMExwcbKKjo4uwKtzOkpOTjSSzefNmY4wxKSkpplSpUmbJkiVWmx9++MFIMrGxscaYK18qLi4uJjEx0Woze/Zs4+vra9LT0wv3AOD0zp49a2rVqmViYmLMAw88YAU2xhoKyujRo03Lli2vuT0rK8sEBQWZqVOnWutSUlKMh4eH+c9//mOMMeb77783ksyOHTusNqtWrTI2m838+uuvjiset5WIiAjz1FNP2a3r0aOH6d27tzGGsYaCcXVgK6hx9d5775myZcva/fs5evRoU6dOnTzVxyWR13Hp0iXt2rVLHTp0sNa5uLioQ4cOio2NLcLKcDtLTU2VJJUrV06StGvXLl2+fNlunN15552qWrWqNc5iY2PVoEEDBQYGWm3Cw8OVlpamgwcPFmL1uB1ERUUpIiLCbkxJjDUUnP/9739q1qyZHn30UQUEBO
juu+/Wv/71L2v7sWPHlJiYaDfW/Pz81Lx5c7ux5u/vr2bNmlltOnToIBcXF23btq3wDgZO7f7779f69et16NAhSdLevXu1detWderUSRJjDY5RUOMqNjZWrVu3lru7u9UmPDxccXFxOnPmzE3X43arB1Sc/fbbb8rMzLT7xUWSAgMD9eOPPxZRVbidZWVladiwYWrRooXq168vSUpMTJS7u7v8/f3t2gYGBioxMdFqk9s4zN4GZFu8eLG+++477dixI8c2xhoKyk8//aTZs2drxIgR+vvf/64dO3bo+eefl7u7uyIjI62xkttY+vNYCwgIsNvu5uamcuXKMdZgeemll5SWlqY777xTrq6uyszM1KRJk9S7d29JYqzBIQpqXCUmJio0NDTHPrK3lS1b9qbqIbABhSgqKkoHDhzQ1q1bi7oUFEM///yzhg4dqpiYGHl6ehZ1OSjGsrKy1KxZM73xxhuSpLvvvlsHDhzQnDlzFBkZWcTVoTj59NNPtXDhQi1atEh33XWX9uzZo2HDhik4OJixhhKDSyKvo0KFCnJ1dc0xg1pSUpKCgoKKqCrcrgYPHqwVK1Zo48aNqlKlirU+KChIly5dUkpKil37P4+zoKCgXMdh9jZAunLJY3Jyspo0aSI3Nze5ublp8+bNmjlzptzc3BQYGMhYQ4GoVKmS6tWrZ7eubt26io+Pl/R/Y+V6/34GBQUpOTnZbntGRoZOnz7NWIPlhRde0EsvvaRevXqpQYMG6tu3r4YPH67o6GhJjDU4RkGNq4L6N5XAdh3u7u5q2rSp1q9fb63LysrS+vXrFRYWVoSV4XZijNHgwYO1bNkybdiwIcep8aZNm6pUqVJ24ywuLk7x8fHWOAsLC9P+/fvtvhhiYmLk6+ub45cmlFzt27fX/v37tWfPHuvVrFkz9e7d2/ozYw0FoUWLFjkeT3Lo0CFVq1ZNkhQaGqqgoCC7sZaWlqZt27bZjbWUlBTt2rXLarNhwwZlZWWpefPmhXAUuB1cuHBBLi72v666uroqKytLEmMNjlFQ4yosLExbtmzR5cuXrTYxMTGqU6fOTV8OKYlp/W9k8eLFxsPDw8ybN898//33ZtCgQcbf399uBjXgep599lnj5+dnNm3aZBISEqzXhQsXrDbPPPOMqVq1qtmwYYPZuXOnCQsLM2FhYdb27KnWH3roIbNnzx6zevVqU7FiRaZaxw39eZZIYxhrKBjbt283bm5uZtKkSebw4cNm4cKFxtvb23z88cdWmzfffNP4+/ubzz//3Ozbt8907do11ymx7777brNt2zazdetWU6tWLaZah53IyEhTuXJla1r/pUuXmgoVKpgXX3zRasNYQ36cPXvW7N692+zevdtIMtOmTTO7d+82J06cMMYUzLhKSUkxgYGBpm/fvubAgQNm8eLFxtvbm2n9HeGdd94xVatWNe7u7ubee+813377bVGXhNuIpFxfc+fOtdr88ccf5rnnnjNly5Y13t7epnv37iYhIcFuP8ePHzedOnUyXl5epkKFCmbkyJHm8uXLhXw0uN1cHdgYaygoX3zxhalfv77x8PAwd955p/nnP/9ptz0rK8uMHTvWBAYGGg8PD9O+fXsTFxdn1+b33383TzzxhClTpozx9fU1AwYMMGfPni3Mw4CTS0tLM0OHDjVVq1Y1np6e5o477jAvv/yy3TTpjDXkx8aNG3P9/SwyMtIYU3Djau/evaZly5bGw8PDVK5c2bz55pt5rtVmzJ8eFQ8AAAAAcBrcwwYAAAAATorABgAAAABOisAGAAAAAE6KwAYAAAAATorABgAAAABOisAGAAAAAE6KwAYAAAAATorABgAAAABOisAGAAAAAE6KwAYAAAAATorABgAAAABO6v8BZkQNQuZSjdAAAAAASUVORK5CYII=",
            "text/plain": [
              "<Figure size 1000x600 with 2 Axes>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "dataset_clean[\"length of text\"] = dataset_clean[\"text\"].map(len)\n",
        "ax = dataset_clean.plot.hist(column=[\"length of text\"], by=\"_label_\", bins=50, alpha=0.5, figsize=(10, 6), title=\"Distribution of tweet string length per class\", xlim=[0, 1000])"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "HgDwJos4FgJK"
      },
      "source": [
        "## Feature engineering"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 15,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "vX9IFcHYFi7f",
        "outputId": "754396a9-2faf-43a7-d1f7-ab687192093b"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Feature Engineering method: Binary (one hot encoding)\n"
          ]
        }
      ],
      "source": [
        "def feat_eng_text_df(in_df, text_col, labels_col, config_dict):\n",
        "  if \"CountVectorizer-binary\" == config_dict[\"feature_eng_details\"]:\n",
        "    print(\"Feature Engineering method: Binary (one hot encoding)\")\n",
        "    countvectorizer = CountVectorizer(ngram_range=(config_dict[\"ngram_range_min\"], config_dict[\"ngram_range_max\"]),\n",
        "                                      stop_words='english',\n",
        "                                      max_features=config_dict[\"max_features\"],\n",
        "                                      binary=True)\n",
        "\n",
        "  elif \"CountVectorizer-BOW\" == config_dict[\"feature_eng_details\"]:\n",
        "    print(\"Feature Engineering method: Bag of words\")\n",
        "    countvectorizer = CountVectorizer(ngram_range=(config_dict[\"ngram_range_min\"], config_dict[\"ngram_range_max\"]),\n",
        "                                      stop_words='english',\n",
        "                                      max_features=config_dict[\"max_features\"],\n",
        "                                      binary=False)\n",
        "\n",
        "  out_arr = countvectorizer.fit_transform(in_df[text_col])\n",
        "  count_tokens = countvectorizer.get_feature_names_out()\n",
        "  out_df = pd.DataFrame(data = out_arr.toarray(),columns = count_tokens)\n",
        "  out_df[labels_col] = list(in_df[labels_col])\n",
        "  return out_df\n",
        "\n",
        "\n",
        "if config_dict[\"do_feature_eng\"]:\n",
        "  dataset_feat_eng = feat_eng_text_df(dataset_clean, 'text', '_label_', config_dict)\n",
        "else:\n",
        "  # This option isn't being supported, the notebook would fail. This option is\n",
        "  # here to cater for a ML pipeline that uses deep learning language models that consume text, and not engineered features.\n",
        "  dataset_feat_eng = dataset_clean.copy()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "f_WMzIFM_i6A"
      },
      "source": [
        "## Exploring the new numerical features  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 16,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 255
        },
        "id": "Aqz7oNbE_328",
        "outputId": "c2bf8090-578f-4667-b1d4-e74091bf3017"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "dataframe",
              "variable_name": "dataset_feat_eng"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-5a0a9c82-a7fe-4870-8275-fd9cd04fbce2\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>aapl</th>\n",
              "      <th>ab</th>\n",
              "      <th>access</th>\n",
              "      <th>according</th>\n",
              "      <th>acquire</th>\n",
              "      <th>acquires</th>\n",
              "      <th>acquisition</th>\n",
              "      <th>action</th>\n",
              "      <th>activity</th>\n",
              "      <th>actual</th>\n",
              "      <th>...</th>\n",
              "      <th>yy</th>\n",
              "      <th>zeroeight</th>\n",
              "      <th>zerofive</th>\n",
              "      <th>zerofour</th>\n",
              "      <th>zeroone</th>\n",
              "      <th>zeroseven</th>\n",
              "      <th>zerosix</th>\n",
              "      <th>zerothree</th>\n",
              "      <th>zerotwo</th>\n",
              "      <th>_label_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 1001 columns</p>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-5a0a9c82-a7fe-4870-8275-fd9cd04fbce2')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-5a0a9c82-a7fe-4870-8275-fd9cd04fbce2 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-5a0a9c82-a7fe-4870-8275-fd9cd04fbce2');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-2aa101ec-24bc-4431-a39b-af9f32f4583d\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-2aa101ec-24bc-4431-a39b-af9f32f4583d')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-2aa101ec-24bc-4431-a39b-af9f32f4583d button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "   aapl  ab  access  according  acquire  acquires  acquisition  action  \\\n",
              "0     0   0       0          0        0         0            0       0   \n",
              "1     0   0       0          0        0         0            0       0   \n",
              "2     0   0       0          0        0         0            0       0   \n",
              "3     0   0       0          0        0         0            0       0   \n",
              "4     0   0       0          0        0         0            0       0   \n",
              "\n",
              "   activity  actual  ...  yy  zeroeight  zerofive  zerofour  zeroone  \\\n",
              "0         0       0  ...   0          0         0         0        0   \n",
              "1         0       0  ...   0          0         0         0        0   \n",
              "2         0       0  ...   0          0         0         0        0   \n",
              "3         0       0  ...   0          0         0         0        0   \n",
              "4         0       0  ...   0          0         0         0        0   \n",
              "\n",
              "   zeroseven  zerosix  zerothree  zerotwo  _label_  \n",
              "0          0        0          0        0        0  \n",
              "1          0        0          0        0        0  \n",
              "2          0        0          0        0        0  \n",
              "3          0        0          0        0        0  \n",
              "4          0        0          0        0        0  \n",
              "\n",
              "[5 rows x 1001 columns]"
            ]
          },
          "execution_count": 16,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset_feat_eng.head()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 17,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 193
        },
        "id": "8ZPYzjltUZ3Q",
        "outputId": "cb2b500c-1024-4e8e-e428-46417f964d41"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "dataframe"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-220c3e7a-0f9d-492d-a91f-d60d4a6caf9c\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>aapl</th>\n",
              "      <th>ab</th>\n",
              "      <th>access</th>\n",
              "      <th>according</th>\n",
              "      <th>acquire</th>\n",
              "      <th>acquires</th>\n",
              "      <th>acquisition</th>\n",
              "      <th>action</th>\n",
              "      <th>activity</th>\n",
              "      <th>actual</th>\n",
              "      <th>...</th>\n",
              "      <th>yy</th>\n",
              "      <th>zeroeight</th>\n",
              "      <th>zerofive</th>\n",
              "      <th>zerofour</th>\n",
              "      <th>zeroone</th>\n",
              "      <th>zeroseven</th>\n",
              "      <th>zerosix</th>\n",
              "      <th>zerothree</th>\n",
              "      <th>zerotwo</th>\n",
              "      <th>_label_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>min</th>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.00000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.000000</td>\n",
              "      <td>0.00000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>max</th>\n",
              "      <td>1.00000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>...</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.00000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.000000</td>\n",
              "      <td>1.00000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>mean</th>\n",
              "      <td>0.00379</td>\n",
              "      <td>0.002653</td>\n",
              "      <td>0.003459</td>\n",
              "      <td>0.016582</td>\n",
              "      <td>0.003316</td>\n",
              "      <td>0.004074</td>\n",
              "      <td>0.007344</td>\n",
              "      <td>0.007154</td>\n",
              "      <td>0.002701</td>\n",
              "      <td>0.003459</td>\n",
              "      <td>...</td>\n",
              "      <td>0.003316</td>\n",
              "      <td>0.002511</td>\n",
              "      <td>0.00379</td>\n",
              "      <td>0.004785</td>\n",
              "      <td>0.004501</td>\n",
              "      <td>0.003459</td>\n",
              "      <td>0.002653</td>\n",
              "      <td>0.003411</td>\n",
              "      <td>0.004548</td>\n",
              "      <td>0.20832</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>3 rows × 1001 columns</p>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-220c3e7a-0f9d-492d-a91f-d60d4a6caf9c')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-220c3e7a-0f9d-492d-a91f-d60d4a6caf9c button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-220c3e7a-0f9d-492d-a91f-d60d4a6caf9c');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-7b53b673-fd88-4d6f-b547-060c06e56cb3\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-7b53b673-fd88-4d6f-b547-060c06e56cb3')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-7b53b673-fd88-4d6f-b547-060c06e56cb3 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "         aapl        ab    access  according   acquire  acquires  acquisition  \\\n",
              "min   0.00000  0.000000  0.000000   0.000000  0.000000  0.000000     0.000000   \n",
              "max   1.00000  1.000000  1.000000   1.000000  1.000000  1.000000     1.000000   \n",
              "mean  0.00379  0.002653  0.003459   0.016582  0.003316  0.004074     0.007344   \n",
              "\n",
              "        action  activity    actual  ...        yy  zeroeight  zerofive  \\\n",
              "min   0.000000  0.000000  0.000000  ...  0.000000   0.000000   0.00000   \n",
              "max   1.000000  1.000000  1.000000  ...  1.000000   1.000000   1.00000   \n",
              "mean  0.007154  0.002701  0.003459  ...  0.003316   0.002511   0.00379   \n",
              "\n",
              "      zerofour   zeroone  zeroseven   zerosix  zerothree   zerotwo  _label_  \n",
              "min   0.000000  0.000000   0.000000  0.000000   0.000000  0.000000  0.00000  \n",
              "max   1.000000  1.000000   1.000000  1.000000   1.000000  1.000000  1.00000  \n",
              "mean  0.004785  0.004501   0.003459  0.002653   0.003411  0.004548  0.20832  \n",
              "\n",
              "[3 rows x 1001 columns]"
            ]
          },
          "execution_count": 17,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset_feat_eng.describe().loc[['min', 'max', 'mean']]"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "Vf-JsLSKC-Vk"
      },
      "source": [
        "## Split to Train/Test"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 18,
      "metadata": {
        "id": "RQzV5OeY6RaZ"
      },
      "outputs": [],
      "source": [
        "dataset_feat_eng_test = dataset_feat_eng.sample(frac=config_dict[\"test_size\"],random_state=config_dict['seed'])\n",
        "dataset_feat_eng_train = dataset_feat_eng.drop(dataset_feat_eng_test.index)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "TfsphIcZFZEn"
      },
      "source": [
        "## Preliminary statistical analysis and feasibility study\n",
        "This process is perhaps the most valuable for the preliminary study prior to applying ML.  \n",
        "This is where we measure the relationship between \"X\" and \"Y\" so to see whether there is a \"correlation\".  \n",
        "\n",
        "If this were a regression problem, where X and Y are numerical, then it would make sense to evaluate the correlation between X and Y, so to learn whether one could expect a linear regression model to yield good results.\n",
        "\n",
        "Since neither X nor Y are numerical in their nature, we seek to evaluated the **statistical dependence** between them, so to know whether a model would have any \"signal\" to pick up on.  "
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "k-ZtgfrJ9nhl"
      },
      "source": [
        "Calc:  \n",
        "**P(feature | class)**"
      ]
    },
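    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a brief sketch of why a per-class mean can serve as this estimate, assume each feature is a binary indicator $x_{i,f}$ that is 1 when term $f$ appears in tweet $i$ and 0 otherwise, and let $y_i$ be that tweet's label. The symbols $x_{i,f}$, $y_i$ and $N_c$ are notation introduced here for illustration only. Averaging the indicator over the $N_c$ training tweets of class $c$ then gives the empirical conditional frequency:\n",
        "\n",
        "$$\n",
        "\\frac{1}{N_c} \\sum_{i \\,:\\, y_i = c} x_{i,f} = \\frac{\\operatorname{count}(x_f = 1,\\ y = c)}{\\operatorname{count}(y = c)} = \\hat{P}(\\text{feature} \\mid \\text{class} = c)\n",
        "$$\n",
        "\n",
        "If the features are not binary, the same average is only a proxy for this probability."
      ]
    },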
    {
      "cell_type": "code",
      "execution_count": 19,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 206
        },
        "id": "HPvdzu9utqMC",
        "outputId": "cc40cbfd-3639-4d91-c04f-f8a53d17736b"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "summary": "{\n  \"name\": \"means_by_class\",\n  \"rows\": 1000,\n  \"fields\": [\n    {\n      \"column\": 0,\n      \"properties\": {\n        \"dtype\": \"number\",\n        \"std\": 0.00856660785998122,\n        \"min\": 0.0,\n        \"max\": 0.11301677651288196,\n        \"num_unique_values\": 234,\n        \"samples\": [\n          0.04665967645296585,\n          0.011908328340323546,\n          0.01145895745955662\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    },\n    {\n      \"column\": 1,\n      \"properties\": {\n        \"dtype\": \"number\",\n        \"std\": 0.007895807272059501,\n        \"min\": 0.0,\n        \"max\": 0.11035653650254669,\n        \"num_unique_values\": 92,\n        \"samples\": [\n          0.011035653650254669,\n          0.02801358234295416,\n          0.0050933786078098476\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}",
              "type": "dataframe",
              "variable_name": "means_by_class"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-0bb25bf4-af2f-4c56-bb83-8684171368cd\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th>_label_</th>\n",
              "      <th>0</th>\n",
              "      <th>1</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>aapl</th>\n",
              "      <td>0.004269</td>\n",
              "      <td>0.002830</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>ab</th>\n",
              "      <td>0.003146</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>access</th>\n",
              "      <td>0.002397</td>\n",
              "      <td>0.007357</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>according</th>\n",
              "      <td>0.019847</td>\n",
              "      <td>0.004527</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>acquire</th>\n",
              "      <td>0.004269</td>\n",
              "      <td>0.000000</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-0bb25bf4-af2f-4c56-bb83-8684171368cd')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-0bb25bf4-af2f-4c56-bb83-8684171368cd button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-0bb25bf4-af2f-4c56-bb83-8684171368cd');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-e3060b36-b8b6-4a0a-8181-7eab36b8e7ed\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-e3060b36-b8b6-4a0a-8181-7eab36b8e7ed')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-e3060b36-b8b6-4a0a-8181-7eab36b8e7ed button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "_label_           0         1\n",
              "aapl       0.004269  0.002830\n",
              "ab         0.003146  0.000000\n",
              "access     0.002397  0.007357\n",
              "according  0.019847  0.004527\n",
              "acquire    0.004269  0.000000"
            ]
          },
          "execution_count": 19,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "## Statistics of features per class:\n",
        "means_by_class = dataset_feat_eng_train.groupby(by=[\"_label_\"]).mean().T.sort_index()\n",
        "means_by_class.head()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "LINyT4DBAsf8"
      },
      "source": [
        "Calc the ratio that reflects statistical dependence:  \n",
        "**P(class, feature)/(P(class)P(feature))**  \n",
        "And note that it could be rewritten as:  \n",
        "**P(class | feature)/P(class)**  \n",
        "Or equivalently:  \n",
        "**P(feature | class)/P(feature)**  \n",
        "\n",
        "\\*Note:  \n",
        "The below calculation is assuming that the numerical features of each text term is **binary**, only then is the below a probability measure.  \n",
        "If another feature method is used, such as BoW or TF/IDF, then the below is not the probability, but a proxy of it.  "
      ]
    },
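    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The equivalence of the three forms follows from the definition of conditional probability. A short derivation, writing $c$ for the class and $f$ for the feature (shorthand introduced here for brevity):\n",
        "\n",
        "$$\n",
        "\\frac{P(c, f)}{P(c)\\,P(f)} = \\frac{P(c \\mid f)\\,P(f)}{P(c)\\,P(f)} = \\frac{P(c \\mid f)}{P(c)}\n",
        "\\qquad \\text{and} \\qquad\n",
        "\\frac{P(c, f)}{P(c)\\,P(f)} = \\frac{P(f \\mid c)\\,P(c)}{P(c)\\,P(f)} = \\frac{P(f \\mid c)}{P(f)}\n",
        "$$\n",
        "\n",
        "A ratio of 1 means that the class and the feature are statistically independent. A ratio above 1 means the feature is over-represented in that class, and a ratio below 1 means it is under-represented.  \n",
        "\n",
        "As a rough illustration with numbers printed earlier in this notebook, and assuming the features are binary: the term `according` has a class-0 mean of 0.019847 and an overall mean of 0.016582, so its ratio for class 0 is about $0.019847 / 0.016582 \\approx 1.20$, while for class 1 it is about $0.004527 / 0.016582 \\approx 0.27$. Note that the per-class means come from the training split whereas the overall mean comes from the full dataset, so these figures are approximate."
      ]
    },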
    {
      "cell_type": "code",
      "execution_count": 20,
      "metadata": {
        "id": "QwHZPJwuAssf"
      },
      "outputs": [],
      "source": [
        "P_class = sorted([[c, np.mean(dataset_feat_eng[\"_label_\"] == c)] for c in set(means_by_class.columns)])\n",
        "P_feature = sorted([[f, np.mean(dataset_feat_eng[f] > 0)] for f in dataset_feat_eng.columns if f != \"_label_\"])\n",
        "P_feature_inv = [[f, 1/p] for f, p in P_feature]\n",
        "\n",
        "P_class_arr = np.array(P_class)\n",
        "P_feature_arr = np.array(P_feature)\n",
        "P_feature_inv_arr = np.array(P_feature_inv)\n",
        "# Multiplying a \"column vector\" of feature probablities with a \"line vector\" of\n",
        "# class probilities to get a matrix where each element is a product of probabilities:\n",
        "P_class_prod_P_feature_inv_arr = np.outer(P_feature_inv_arr[:, 1].astype(float), P_class_arr[:, 1].astype(float))\n",
        "\n",
        "P_class_given_feature = means_by_class.copy()\n",
        "for feature_counter in range(len(P_class_given_feature)):\n",
        "  for c in P_class_given_feature.columns:\n",
        "    # Right hand side: P(feature | class) / P(feature)\n",
        "    P_class_given_feature[c][feature_counter] = means_by_class[c][feature_counter] / P_feature_arr[feature_counter, 1].astype(float)\n",
        "\n",
        "\n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "ipfL9o6XX8pE"
      },
      "source": [
        "**The terms that are most indicative of class \"0\":**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 21,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 363
        },
        "id": "bEl4iKwaX82G",
        "outputId": "6ba9112f-4e1c-4d9b-8150-c7b8a894388b"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "summary": "{\n  \"name\": \"P_class_given_feature\",\n  \"rows\": 10,\n  \"fields\": [\n    {\n      \"column\": 0,\n      \"properties\": {\n        \"dtype\": \"number\",\n        \"std\": 0.014088535506641792,\n        \"min\": 1.3720253908678794,\n        \"max\": 1.4114391637421895,\n        \"num_unique_values\": 9,\n        \"samples\": [\n          1.3734922747497766,\n          1.4029705287597363,\n          1.380427260989544\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}",
              "type": "dataframe"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-f1819fc8-2431-4e2f-9a48-172dd31f9b69\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th>_label_</th>\n",
              "      <th>0</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>latest updates</th>\n",
              "      <td>1.411439</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>gasoline</th>\n",
              "      <td>1.402971</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>spx esf</th>\n",
              "      <td>1.398410</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>encourages investors</th>\n",
              "      <td>1.393258</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>warning</th>\n",
              "      <td>1.390024</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>upside</th>\n",
              "      <td>1.380427</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>monthly distribution</th>\n",
              "      <td>1.375892</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>contact firm</th>\n",
              "      <td>1.373492</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>esf</th>\n",
              "      <td>1.373492</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>ago</th>\n",
              "      <td>1.372025</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "_label_                      0\n",
              "latest updates        1.411439\n",
              "gasoline              1.402971\n",
              "spx esf               1.398410\n",
              "encourages investors  1.393258\n",
              "warning               1.390024\n",
              "upside                1.380427\n",
              "monthly distribution  1.375892\n",
              "contact firm          1.373492\n",
              "esf                   1.373492\n",
              "ago                   1.372025"
            ]
          },
          "execution_count": 21,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "P_class_given_feature.sort_values([0], ascending=False)[[0]].head(10)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "MaZFPdu0ZZgW"
      },
      "source": [
        "**The terms that are most indicative of class \"1\":**"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 22,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 363
        },
        "id": "Hqyvb9KlZZyW",
        "outputId": "3ab4e96b-5d4a-45cc-b7a4-5dce8419b948"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "summary": "{\n  \"name\": \"P_class_given_feature\",\n  \"rows\": 10,\n  \"fields\": [\n    {\n      \"column\": 1,\n      \"properties\": {\n        \"dtype\": \"number\",\n        \"std\": 0.21660088553432122,\n        \"min\": 3.9817015657423127,\n        \"max\": 4.718646330672519,\n        \"num_unique_values\": 8,\n        \"samples\": [\n          4.386093131013017,\n          4.063798505242154,\n          4.718646330672519\n        ],\n        \"semantic_type\": \"\",\n        \"description\": \"\"\n      }\n    }\n  ]\n}",
              "type": "dataframe"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-a4598374-84a4-4daf-9ce1-9f033271376e\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th>_label_</th>\n",
              "      <th>1</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>launches</th>\n",
              "      <td>4.718646</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>solution</th>\n",
              "      <td>4.386093</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>partnership</th>\n",
              "      <td>4.316795</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>announcement form</th>\n",
              "      <td>4.221977</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>form eightthree</th>\n",
              "      <td>4.221977</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>uk regulatory</th>\n",
              "      <td>4.121902</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>regulatory announcement</th>\n",
              "      <td>4.121902</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>plc</th>\n",
              "      <td>4.063799</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>partner</th>\n",
              "      <td>4.029674</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>cloud</th>\n",
              "      <td>3.981702</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "</div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "_label_                         1\n",
              "launches                 4.718646\n",
              "solution                 4.386093\n",
              "partnership              4.316795\n",
              "announcement form        4.221977\n",
              "form eightthree          4.221977\n",
              "uk regulatory            4.121902\n",
              "regulatory announcement  4.121902\n",
              "plc                      4.063799\n",
              "partner                  4.029674\n",
              "cloud                    3.981702"
            ]
          },
          "execution_count": 22,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "P_class_given_feature.sort_values([1], ascending=False)[[1]].head(10)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "YyHadLWYchW7"
      },
      "source": [
        "## Feature selection\n",
        "This is a univariate feature selection process: each feature is scored on its own, independently of the other features.  \n",
        "It is based on the conditional dependence between a binary (0/1) feature and a binary (0/1) class. Because each feature is binary, its mean value is the probability that the feature is present.  \n",
        "Note that feature selection is performed **on the training set**.   \n"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "glryYgkh9uGr"
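      },
      "source": [
        "For each class, choose the most indicative features by maximizing either:   \n",
        "* the class-conditional distribution P(feature | class), i.e. Maximum Likelihood (this is the option that the `maximize_a_priori` setting selects)  \n",
        "or  \n",
        "* the a posteriori distribution P(class | feature), i.e. Maximum A Posteriori (MAP)"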
      ]
    },
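    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The two criteria above can be illustrated with a minimal sketch on toy data. Everything in it is an assumption made for illustration: the `toy_*` names, the tiny dataset, and the use of plain Bayes' rule for P(class | feature), which may differ from how `P_class_given_feature` is computed in this notebook.  \n",
        "\n",
        "```python\n",
        "import pandas as pd\n",
        "\n",
        "# Hypothetical toy data: rows are documents, columns are binary term indicators.\n",
        "toy_X = pd.DataFrame({\n",
        "    \"launches\": [1, 1, 0, 0, 0, 0],\n",
        "    \"warning\": [0, 0, 1, 1, 0, 1],\n",
        "    \"new\": [1, 0, 1, 0, 1, 0],\n",
        "})\n",
        "toy_y = pd.Series([1, 1, 0, 0, 0, 0], name=\"_label_\")\n",
        "\n",
        "# P(feature | class): the mean of a 0/1 feature within each class.\n",
        "# After transposing, rows are features and columns are classes.\n",
        "toy_means_by_class = toy_X.groupby(toy_y).mean().T\n",
        "\n",
        "# P(class) and P(feature), needed for Bayes' rule.\n",
        "toy_p_class = toy_y.value_counts(normalize=True)\n",
        "toy_p_feature = toy_X.mean()\n",
        "\n",
        "# P(class | feature) = P(feature | class) * P(class) / P(feature)\n",
        "toy_posterior = toy_means_by_class.mul(toy_p_class, axis=1).div(toy_p_feature, axis=0)\n",
        "\n",
        "# Keep the top_k highest-scoring features of each class, then merge them.\n",
        "top_k = 1\n",
        "toy_chosen = set()\n",
        "for c in toy_posterior.columns:\n",
        "    toy_chosen.update(toy_posterior[c].sort_values(ascending=False).index[:top_k])\n",
        "\n",
        "print(sorted(toy_chosen))\n",
        "```\n",
        "\n",
        "In this toy data, `warning` ranks highest for class 0 and `launches` for class 1, while `new`, which appears in half of the documents of each class, is not selected. Swapping `toy_posterior` for `toy_means_by_class` in the loop would rank by P(feature | class) instead."
      ]
    },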
    {
      "cell_type": "code",
      "execution_count": 23,
      "metadata": {
        "id": "bTnI3Lpk95BU"
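      },
      "outputs": [],
      "source": [
        "# Collect the top-ranked features of each class, then merge them into one list.\n",
        "# Note: the slice below keeps num_chosen_features_per_class + 1 features per class.\n",
        "chosen_features = []\n",
        "if config_dict[\"maximize_a_priori\"] == True:\n",
        "  # Rank features by P(feature | class)\n",
        "  classes = means_by_class.columns\n",
        "  for c in classes:\n",
        "    chosen_features += list(means_by_class[c].sort_values(ascending=False).index[:config_dict[\"num_chosen_features_per_class\"] + 1])\n",
        "else:\n",
        "  # Rank features by P(class | feature)\n",
        "  classes = P_class_given_feature.columns\n",
        "  for c in classes:\n",
        "    chosen_features += list(P_class_given_feature[c].sort_values(ascending=False).index[:config_dict[\"num_chosen_features_per_class\"] + 1])\n",
        "\n",
        "# Remove duplicates: the same feature can rank highly for more than one class.\n",
        "chosen_features = list(set(chosen_features))"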
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 24,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "QsRkvKXuAApE",
        "outputId": "9343b387-9082-4e42-9623-082d06bbefc1"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "['european',\n",
              " 'musk',\n",
              " 'day',\n",
              " 'technology',\n",
              " 'model',\n",
              " 'qtwo thousand',\n",
              " 'according',\n",
              " 'quarterly',\n",
              " 'stockmarket',\n",
              " 'fda',\n",
              " 'cloud',\n",
              " 'dividend',\n",
              " 'health',\n",
              " 'watch',\n",
              " 'twentytwo earnings',\n",
              " 'airlines',\n",
              " 'government',\n",
              " 'says',\n",
              " 'earnings',\n",
              " 'plans',\n",
              " 'eps',\n",
              " 'worlds',\n",
              " 'europe',\n",
              " 'ahead',\n",
              " 'operations',\n",
              " 'president',\n",
              " 'netflix',\n",
              " 'china',\n",
              " 'month',\n",
              " 'buy',\n",
              " 'low',\n",
              " 'solution',\n",
              " 'annual',\n",
              " 'qqq',\n",
              " 'elon musk',\n",
              " 'fix',\n",
              " 'russian',\n",
              " 'technologies',\n",
              " 'billion',\n",
              " 'tesla',\n",
              " 'tsla',\n",
              " 'futures',\n",
              " 'solutions',\n",
              " 'rise',\n",
              " 'million',\n",
              " 'increase',\n",
              " 'signs',\n",
              " 'future',\n",
              " 'shares',\n",
              " 'said',\n",
              " 'report',\n",
              " 'second',\n",
              " 'major',\n",
              " 'including',\n",
              " 'release',\n",
              " 'seven',\n",
              " 'fed',\n",
              " 'development',\n",
              " 'customer',\n",
              " 'product',\n",
              " 'financial results',\n",
              " 'portfolio',\n",
              " 'fast',\n",
              " 'finance',\n",
              " 'treatment',\n",
              " 'people',\n",
              " 'quarter thousand',\n",
              " 'recession',\n",
              " 'new',\n",
              " 'thousand',\n",
              " 'named',\n",
              " 'services',\n",
              " 'years',\n",
              " 'best',\n",
              " 'platform',\n",
              " 'series',\n",
              " 'energy',\n",
              " 'support',\n",
              " 'conference',\n",
              " 'companies',\n",
              " 'management',\n",
              " 'action',\n",
              " 'investor',\n",
              " 'june',\n",
              " 'thousand twentythree',\n",
              " 'united',\n",
              " 'eu',\n",
              " 'update',\n",
              " 'declares',\n",
              " 'august',\n",
              " 'results',\n",
              " 'banks',\n",
              " 'chinas',\n",
              " 'expand',\n",
              " 'plc',\n",
              " 'international',\n",
              " 'inflation',\n",
              " 'euro',\n",
              " 'global',\n",
              " 'economy stockmarket',\n",
              " 'celsius',\n",
              " 'amid',\n",
              " 'expectations',\n",
              " 'sector',\n",
              " 'bank',\n",
              " 'drop',\n",
              " 'house',\n",
              " 'results earnings',\n",
              " 'business',\n",
              " 'fund',\n",
              " 'saudi',\n",
              " 'offering',\n",
              " 'ecb',\n",
              " 'study',\n",
              " 'launch',\n",
              " 'heres',\n",
              " 'mln',\n",
              " 'expands',\n",
              " 'uk',\n",
              " 'central bank',\n",
              " 'cpi',\n",
              " 'rising',\n",
              " 'twentytwo',\n",
              " 'markets',\n",
              " 'data',\n",
              " 'microsoft',\n",
              " 'make',\n",
              " 'prices',\n",
              " 'funding',\n",
              " 'industry',\n",
              " 'use',\n",
              " 'officer',\n",
              " 'company',\n",
              " 'twitter',\n",
              " 'sampp',\n",
              " 'bond',\n",
              " 'pharma',\n",
              " 'lower',\n",
              " 'pre',\n",
              " 'security',\n",
              " 'sales',\n",
              " 'strategic',\n",
              " 'regulatory announcement',\n",
              " 'climate',\n",
              " 'eightthree',\n",
              " 'vs',\n",
              " 'research',\n",
              " 'street',\n",
              " 'service',\n",
              " 'record',\n",
              " 'dollar',\n",
              " 'media',\n",
              " 'prime',\n",
              " 'earnings transcript',\n",
              " 'group',\n",
              " 'point',\n",
              " 'growth',\n",
              " 'end',\n",
              " 'likely',\n",
              " 'announce',\n",
              " 'test',\n",
              " 'build',\n",
              " 'index',\n",
              " 'higher',\n",
              " 'general',\n",
              " 'ford',\n",
              " 'american',\n",
              " 'consumer',\n",
              " 'year',\n",
              " 'economic',\n",
              " 'guidance',\n",
              " 'qtwo',\n",
              " 'products',\n",
              " 'say',\n",
              " 'york',\n",
              " 'companys',\n",
              " 'twentytwo financial',\n",
              " 'cut',\n",
              " 'amazon',\n",
              " 'gaap eps',\n",
              " 'profit',\n",
              " 'est',\n",
              " 'digital',\n",
              " 'financial',\n",
              " 'hit',\n",
              " 'home',\n",
              " 'firm',\n",
              " 'launches',\n",
              " 'trading',\n",
              " 'form eightthree',\n",
              " 'pay',\n",
              " 'amp',\n",
              " 'partnership',\n",
              " 'provide',\n",
              " 'project',\n",
              " 'investors',\n",
              " 'stocks',\n",
              " 'cost',\n",
              " 'systems',\n",
              " 'new york',\n",
              " 'investing',\n",
              " 'unit',\n",
              " 'second quarter',\n",
              " 'stock',\n",
              " 'latest',\n",
              " 'market',\n",
              " 'oil',\n",
              " 'jobs',\n",
              " 'quarter',\n",
              " 'public',\n",
              " 'supply',\n",
              " 'customers',\n",
              " 'bitcoin',\n",
              " 'hike',\n",
              " 'power',\n",
              " 'trade',\n",
              " 'twentythree',\n",
              " 'regulatory',\n",
              " 'partner',\n",
              " 'reports',\n",
              " 'revenue',\n",
              " 'transcript',\n",
              " 'central',\n",
              " 'expansion',\n",
              " 'raises',\n",
              " 'announces',\n",
              " 'funds',\n",
              " 'agreement',\n",
              " 'net',\n",
              " 'crisis',\n",
              " 'early',\n",
              " 'asset',\n",
              " 'software',\n",
              " 'receives',\n",
              " 'capital',\n",
              " 'spy',\n",
              " 'months',\n",
              " 'tech',\n",
              " 'twentytwo results',\n",
              " 'clinical',\n",
              " 'uk regulatory',\n",
              " 'ukraine',\n",
              " 'investment',\n",
              " 'gas',\n",
              " 'hiring',\n",
              " 'key',\n",
              " 'biden',\n",
              " 'phase',\n",
              " 'outlook',\n",
              " 'ev',\n",
              " 'thousand twentytwo',\n",
              " 'order',\n",
              " 'week',\n",
              " 'announcement form',\n",
              " 'appoints',\n",
              " 'biggest',\n",
              " 'state',\n",
              " 'prime minister',\n",
              " 'russia',\n",
              " 'contract',\n",
              " 'acquisition',\n",
              " 'announcement',\n",
              " 'form',\n",
              " 'gains',\n",
              " 'plan',\n",
              " 'holdings',\n",
              " 'expected',\n",
              " 'offer',\n",
              " 'today',\n",
              " 'rate',\n",
              " 'elon',\n",
              " 'economy',\n",
              " 'rates',\n",
              " 'crypto',\n",
              " 'minister',\n",
              " 'policy',\n",
              " 'value',\n",
              " 'drug',\n",
              " 'share',\n",
              " 'spx',\n",
              " 'therapeutics',\n",
              " 'employees',\n",
              " 'network',\n",
              " 'bankruptcy',\n",
              " 'program',\n",
              " 'big',\n",
              " 'releases',\n",
              " 'help',\n",
              " 'air',\n",
              " 'strong',\n",
              " 'points',\n",
              " 'july',\n",
              " 'time',\n",
              " 'electric',\n",
              " 'cancer',\n",
              " 'deal',\n",
              " 'potential',\n",
              " 'files',\n",
              " 'federal',\n",
              " 'production',\n",
              " 'crude',\n",
              " 'chief',\n",
              " 'debt',\n",
              " 'partners',\n",
              " 'ceo',\n",
              " 'open',\n",
              " 'losses',\n",
              " 'boeing',\n",
              " 'price',\n",
              " 'gaap',\n",
              " 'wall',\n",
              " 'trial',\n",
              " 'close',\n",
              " 'chinese',\n",
              " 'trust',\n",
              " 'beats',\n",
              " 'apple',\n",
              " 'demand',\n",
              " 'vehicle',\n",
              " 'set',\n",
              " 'center',\n",
              " 'risk',\n",
              " 'high',\n",
              " 'credit']"
            ]
          },
          "execution_count": 24,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "chosen_features"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "vhgOQmjkBq78"
      },
      "source": [
        "### Leave only chosen features:\n",
        "Now that we deduced which features are \"important\" based on the train set, we select them for both the train set and the test set.  "
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 25,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 273
        },
        "id": "w4bY4gmWBrGe",
        "outputId": "570e8367-a9d4-40a9-8017-564077ca44ba"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "dataframe",
              "variable_name": "dataset_feat_eng_train_selected"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-f42b6338-b2d4-47b9-a7c0-608a2c9b4482\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>european</th>\n",
              "      <th>musk</th>\n",
              "      <th>day</th>\n",
              "      <th>technology</th>\n",
              "      <th>model</th>\n",
              "      <th>qtwo thousand</th>\n",
              "      <th>according</th>\n",
              "      <th>quarterly</th>\n",
              "      <th>stockmarket</th>\n",
              "      <th>fda</th>\n",
              "      <th>...</th>\n",
              "      <th>beats</th>\n",
              "      <th>apple</th>\n",
              "      <th>demand</th>\n",
              "      <th>vehicle</th>\n",
              "      <th>set</th>\n",
              "      <th>center</th>\n",
              "      <th>risk</th>\n",
              "      <th>high</th>\n",
              "      <th>credit</th>\n",
              "      <th>_label_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 325 columns</p>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-f42b6338-b2d4-47b9-a7c0-608a2c9b4482')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-f42b6338-b2d4-47b9-a7c0-608a2c9b4482 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-f42b6338-b2d4-47b9-a7c0-608a2c9b4482');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-0130e453-af9f-44db-b153-2e1e0bfd590d\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-0130e453-af9f-44db-b153-2e1e0bfd590d')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-0130e453-af9f-44db-b153-2e1e0bfd590d button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "   european  musk  day  technology  model  qtwo thousand  according  \\\n",
              "0         0     0    0           0      0              0          0   \n",
              "1         0     0    0           0      0              0          0   \n",
              "2         0     0    0           0      0              0          0   \n",
              "3         0     0    0           0      0              0          0   \n",
              "4         0     0    0           0      0              0          0   \n",
              "\n",
              "   quarterly  stockmarket  fda  ...  beats  apple  demand  vehicle  set  \\\n",
              "0          0            0    0  ...      0      1       0        0    0   \n",
              "1          0            0    0  ...      0      0       0        0    0   \n",
              "2          0            0    0  ...      0      0       0        0    0   \n",
              "3          0            0    0  ...      0      0       0        0    0   \n",
              "4          0            0    0  ...      0      0       0        0    1   \n",
              "\n",
              "   center  risk  high  credit  _label_  \n",
              "0       0     0     0       0        0  \n",
              "1       0     0     0       0        0  \n",
              "2       0     0     0       0        0  \n",
              "3       0     0     0       0        0  \n",
              "4       0     0     0       0        0  \n",
              "\n",
              "[5 rows x 325 columns]"
            ]
          },
          "execution_count": 25,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset_feat_eng_train_selected = dataset_feat_eng_train.filter(chosen_features + [\"_label_\"])\n",
        "dataset_feat_eng_test_selected = dataset_feat_eng_test.filter(chosen_features + [\"_label_\"])\n",
        "\n",
        "dataset_feat_eng_train_selected.head()"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 26,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "5jiW-llRT4VK",
        "outputId": "8eb4d490-c97e-41d8-be5b-4e575b74970b"
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "_label_\n",
              "0    13352\n",
              "1     3534\n",
              "Name: count, dtype: int64"
            ]
          },
          "execution_count": 26,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset_feat_eng_train_selected[\"_label_\"].value_counts()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "4JAQspmJCwfT"
      },
      "source": [
        "# Machine Learning   \n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 27,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 273
        },
        "id": "QIVm-heY6AdA",
        "outputId": "add29569-0ed0-41cd-eae6-481a1842554c"
      },
      "outputs": [
        {
          "data": {
            "application/vnd.google.colaboratory.intrinsic+json": {
              "type": "dataframe",
              "variable_name": "dataset_feat_eng_train_selected"
            },
            "text/html": [
              "\n",
              "  <div id=\"df-346d0e66-d350-498a-b558-b74424b38de5\" class=\"colab-df-container\">\n",
              "    <div>\n",
              "<style scoped>\n",
              "    .dataframe tbody tr th:only-of-type {\n",
              "        vertical-align: middle;\n",
              "    }\n",
              "\n",
              "    .dataframe tbody tr th {\n",
              "        vertical-align: top;\n",
              "    }\n",
              "\n",
              "    .dataframe thead th {\n",
              "        text-align: right;\n",
              "    }\n",
              "</style>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "  <thead>\n",
              "    <tr style=\"text-align: right;\">\n",
              "      <th></th>\n",
              "      <th>european</th>\n",
              "      <th>musk</th>\n",
              "      <th>day</th>\n",
              "      <th>technology</th>\n",
              "      <th>model</th>\n",
              "      <th>qtwo thousand</th>\n",
              "      <th>according</th>\n",
              "      <th>quarterly</th>\n",
              "      <th>stockmarket</th>\n",
              "      <th>fda</th>\n",
              "      <th>...</th>\n",
              "      <th>beats</th>\n",
              "      <th>apple</th>\n",
              "      <th>demand</th>\n",
              "      <th>vehicle</th>\n",
              "      <th>set</th>\n",
              "      <th>center</th>\n",
              "      <th>risk</th>\n",
              "      <th>high</th>\n",
              "      <th>credit</th>\n",
              "      <th>_label_</th>\n",
              "    </tr>\n",
              "  </thead>\n",
              "  <tbody>\n",
              "    <tr>\n",
              "      <th>0</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>1</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>2</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>3</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "    <tr>\n",
              "      <th>4</th>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>...</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>1</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "      <td>0</td>\n",
              "    </tr>\n",
              "  </tbody>\n",
              "</table>\n",
              "<p>5 rows × 325 columns</p>\n",
              "</div>\n",
              "    <div class=\"colab-df-buttons\">\n",
              "\n",
              "  <div class=\"colab-df-container\">\n",
              "    <button class=\"colab-df-convert\" onclick=\"convertToInteractive('df-346d0e66-d350-498a-b558-b74424b38de5')\"\n",
              "            title=\"Convert this dataframe to an interactive table.\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "  <svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\" viewBox=\"0 -960 960 960\">\n",
              "    <path d=\"M120-120v-720h720v720H120Zm60-500h600v-160H180v160Zm220 220h160v-160H400v160Zm0 220h160v-160H400v160ZM180-400h160v-160H180v160Zm440 0h160v-160H620v160ZM180-180h160v-160H180v160Zm440 0h160v-160H620v160Z\"/>\n",
              "  </svg>\n",
              "    </button>\n",
              "\n",
              "  <style>\n",
              "    .colab-df-container {\n",
              "      display:flex;\n",
              "      gap: 12px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert {\n",
              "      background-color: #E8F0FE;\n",
              "      border: none;\n",
              "      border-radius: 50%;\n",
              "      cursor: pointer;\n",
              "      display: none;\n",
              "      fill: #1967D2;\n",
              "      height: 32px;\n",
              "      padding: 0 0 0 0;\n",
              "      width: 32px;\n",
              "    }\n",
              "\n",
              "    .colab-df-convert:hover {\n",
              "      background-color: #E2EBFA;\n",
              "      box-shadow: 0px 1px 2px rgba(60, 64, 67, 0.3), 0px 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "      fill: #174EA6;\n",
              "    }\n",
              "\n",
              "    .colab-df-buttons div {\n",
              "      margin-bottom: 4px;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert {\n",
              "      background-color: #3B4455;\n",
              "      fill: #D2E3FC;\n",
              "    }\n",
              "\n",
              "    [theme=dark] .colab-df-convert:hover {\n",
              "      background-color: #434B5C;\n",
              "      box-shadow: 0px 1px 3px 1px rgba(0, 0, 0, 0.15);\n",
              "      filter: drop-shadow(0px 1px 2px rgba(0, 0, 0, 0.3));\n",
              "      fill: #FFFFFF;\n",
              "    }\n",
              "  </style>\n",
              "\n",
              "    <script>\n",
              "      const buttonEl =\n",
              "        document.querySelector('#df-346d0e66-d350-498a-b558-b74424b38de5 button.colab-df-convert');\n",
              "      buttonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "\n",
              "      async function convertToInteractive(key) {\n",
              "        const element = document.querySelector('#df-346d0e66-d350-498a-b558-b74424b38de5');\n",
              "        const dataTable =\n",
              "          await google.colab.kernel.invokeFunction('convertToInteractive',\n",
              "                                                    [key], {});\n",
              "        if (!dataTable) return;\n",
              "\n",
              "        const docLinkHtml = 'Like what you see? Visit the ' +\n",
              "          '<a target=\"_blank\" href=https://colab.research.google.com/notebooks/data_table.ipynb>data table notebook</a>'\n",
              "          + ' to learn more about interactive tables.';\n",
              "        element.innerHTML = '';\n",
              "        dataTable['output_type'] = 'display_data';\n",
              "        await google.colab.output.renderOutput(dataTable, element);\n",
              "        const docLink = document.createElement('div');\n",
              "        docLink.innerHTML = docLinkHtml;\n",
              "        element.appendChild(docLink);\n",
              "      }\n",
              "    </script>\n",
              "  </div>\n",
              "\n",
              "\n",
              "<div id=\"df-c45bd686-565b-4c65-b3bb-6d0a8e507696\">\n",
              "  <button class=\"colab-df-quickchart\" onclick=\"quickchart('df-c45bd686-565b-4c65-b3bb-6d0a8e507696')\"\n",
              "            title=\"Suggest charts\"\n",
              "            style=\"display:none;\">\n",
              "\n",
              "<svg xmlns=\"http://www.w3.org/2000/svg\" height=\"24px\"viewBox=\"0 0 24 24\"\n",
              "     width=\"24px\">\n",
              "    <g>\n",
              "        <path d=\"M19 3H5c-1.1 0-2 .9-2 2v14c0 1.1.9 2 2 2h14c1.1 0 2-.9 2-2V5c0-1.1-.9-2-2-2zM9 17H7v-7h2v7zm4 0h-2V7h2v10zm4 0h-2v-4h2v4z\"/>\n",
              "    </g>\n",
              "</svg>\n",
              "  </button>\n",
              "\n",
              "<style>\n",
              "  .colab-df-quickchart {\n",
              "      --bg-color: #E8F0FE;\n",
              "      --fill-color: #1967D2;\n",
              "      --hover-bg-color: #E2EBFA;\n",
              "      --hover-fill-color: #174EA6;\n",
              "      --disabled-fill-color: #AAA;\n",
              "      --disabled-bg-color: #DDD;\n",
              "  }\n",
              "\n",
              "  [theme=dark] .colab-df-quickchart {\n",
              "      --bg-color: #3B4455;\n",
              "      --fill-color: #D2E3FC;\n",
              "      --hover-bg-color: #434B5C;\n",
              "      --hover-fill-color: #FFFFFF;\n",
              "      --disabled-bg-color: #3B4455;\n",
              "      --disabled-fill-color: #666;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart {\n",
              "    background-color: var(--bg-color);\n",
              "    border: none;\n",
              "    border-radius: 50%;\n",
              "    cursor: pointer;\n",
              "    display: none;\n",
              "    fill: var(--fill-color);\n",
              "    height: 32px;\n",
              "    padding: 0;\n",
              "    width: 32px;\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart:hover {\n",
              "    background-color: var(--hover-bg-color);\n",
              "    box-shadow: 0 1px 2px rgba(60, 64, 67, 0.3), 0 1px 3px 1px rgba(60, 64, 67, 0.15);\n",
              "    fill: var(--button-hover-fill-color);\n",
              "  }\n",
              "\n",
              "  .colab-df-quickchart-complete:disabled,\n",
              "  .colab-df-quickchart-complete:disabled:hover {\n",
              "    background-color: var(--disabled-bg-color);\n",
              "    fill: var(--disabled-fill-color);\n",
              "    box-shadow: none;\n",
              "  }\n",
              "\n",
              "  .colab-df-spinner {\n",
              "    border: 2px solid var(--fill-color);\n",
              "    border-color: transparent;\n",
              "    border-bottom-color: var(--fill-color);\n",
              "    animation:\n",
              "      spin 1s steps(1) infinite;\n",
              "  }\n",
              "\n",
              "  @keyframes spin {\n",
              "    0% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "      border-left-color: var(--fill-color);\n",
              "    }\n",
              "    20% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    30% {\n",
              "      border-color: transparent;\n",
              "      border-left-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    40% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-top-color: var(--fill-color);\n",
              "    }\n",
              "    60% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "    }\n",
              "    80% {\n",
              "      border-color: transparent;\n",
              "      border-right-color: var(--fill-color);\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "    90% {\n",
              "      border-color: transparent;\n",
              "      border-bottom-color: var(--fill-color);\n",
              "    }\n",
              "  }\n",
              "</style>\n",
              "\n",
              "  <script>\n",
              "    async function quickchart(key) {\n",
              "      const quickchartButtonEl =\n",
              "        document.querySelector('#' + key + ' button');\n",
              "      quickchartButtonEl.disabled = true;  // To prevent multiple clicks.\n",
              "      quickchartButtonEl.classList.add('colab-df-spinner');\n",
              "      try {\n",
              "        const charts = await google.colab.kernel.invokeFunction(\n",
              "            'suggestCharts', [key], {});\n",
              "      } catch (error) {\n",
              "        console.error('Error during call to suggestCharts:', error);\n",
              "      }\n",
              "      quickchartButtonEl.classList.remove('colab-df-spinner');\n",
              "      quickchartButtonEl.classList.add('colab-df-quickchart-complete');\n",
              "    }\n",
              "    (() => {\n",
              "      let quickchartButtonEl =\n",
              "        document.querySelector('#df-c45bd686-565b-4c65-b3bb-6d0a8e507696 button');\n",
              "      quickchartButtonEl.style.display =\n",
              "        google.colab.kernel.accessAllowed ? 'block' : 'none';\n",
              "    })();\n",
              "  </script>\n",
              "</div>\n",
              "\n",
              "    </div>\n",
              "  </div>\n"
            ],
            "text/plain": [
              "   european  musk  day  technology  model  qtwo thousand  according  \\\n",
              "0         0     0    0           0      0              0          0   \n",
              "1         0     0    0           0      0              0          0   \n",
              "2         0     0    0           0      0              0          0   \n",
              "3         0     0    0           0      0              0          0   \n",
              "4         0     0    0           0      0              0          0   \n",
              "\n",
              "   quarterly  stockmarket  fda  ...  beats  apple  demand  vehicle  set  \\\n",
              "0          0            0    0  ...      0      1       0        0    0   \n",
              "1          0            0    0  ...      0      0       0        0    0   \n",
              "2          0            0    0  ...      0      0       0        0    0   \n",
              "3          0            0    0  ...      0      0       0        0    0   \n",
              "4          0            0    0  ...      0      0       0        0    1   \n",
              "\n",
              "   center  risk  high  credit  _label_  \n",
              "0       0     0     0       0        0  \n",
              "1       0     0     0       0        0  \n",
              "2       0     0     0       0        0  \n",
              "3       0     0     0       0        0  \n",
              "4       0     0     0       0        0  \n",
              "\n",
              "[5 rows x 325 columns]"
            ]
          },
          "execution_count": 27,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "dataset_feat_eng_train_selected.head()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "ffIH343jLKRs"
      },
      "source": [
        "Split the dataset into the features (X) and the labels (Y, the last column), taking the underlying NumPy arrays so that the data suits the models."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 28,
      "metadata": {
        "id": "ox0X7mDgLKg1"
      },
      "outputs": [],
      "source": [
        "# Features are every column except the last; the label (`_label_`) is the last column.\n",
        "x_features_train = dataset_feat_eng_train_selected.values[:, :-1]\n",
        "y_labels_train = dataset_feat_eng_train_selected.values[:, -1]\n",
        "\n",
        "x_features_test = dataset_feat_eng_test_selected.values[:, :-1]\n",
        "y_labels_test = dataset_feat_eng_test_selected.values[:, -1]"
      ]
    },
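    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the slicing above concrete, here is a minimal sketch on a toy frame. `demo_df` and its column names are made up for illustration and are not part of the tweet dataset; the only assumption carried over is that the label sits in the last column. `.values` returns a NumPy array, a form that scikit-learn models accept."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import pandas as pd\n",
        "\n",
        "# Hypothetical toy frame: two binary word features plus the label in the last column.\n",
        "demo_df = pd.DataFrame({'apple': [1, 0, 0], 'risk': [0, 0, 1], '_label_': [1, 0, 1]})\n",
        "\n",
        "demo_x = demo_df.values[:, :-1]  # every column except the last\n",
        "demo_y = demo_df.values[:, -1]   # the last column only\n",
        "\n",
        "# Three rows, two feature columns, and one label per row.\n",
        "print(demo_x.shape, demo_y.shape)"
      ]
    },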
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "SDAccMSXDD7N"
      },
      "source": [
        "#### Iterate over ML models"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 29,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "lJyaQVzyDETH",
        "outputId": "c8c4ea95-d6f1-49cf-e369-f9003f455e6a"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Random Forest: mean(accuracy)=0.837, std(accuracy)=0.033\n",
            "LASSO: mean(accuracy)=0.847, std(accuracy)=0.02\n",
            "KNN: mean(accuracy)=0.81, std(accuracy)=0.038\n",
            "Decision Tree: mean(accuracy)=0.817, std(accuracy)=0.029\n",
            "SVM: mean(accuracy)=0.796, std(accuracy)=0.005\n",
            "\n",
            "Best model is:\n",
            "LASSO\n"
          ]
        }
      ],
      "source": [
        "# Candidate classifiers; \"LASSO\" here means L1-regularized logistic regression.\n",
        "models = []\n",
        "models.append((\"Random Forest\", RandomForestClassifier(random_state=config_dict['seed'])))\n",
        "models.append((\"LASSO\", lm.LogisticRegression(solver='liblinear', penalty='l1', max_iter=1000, random_state=config_dict['seed'])))\n",
        "models.append((\"KNN\", KNeighborsClassifier()))\n",
        "models.append((\"Decision Tree\", DecisionTreeClassifier(random_state=config_dict['seed'])))\n",
        "models.append((\"SVM\", SVC(gamma='auto', random_state=config_dict['seed'])))\n",
        "\n",
        "results = []\n",
        "names = []\n",
        "best_mean_result = 0\n",
        "best_std_result = 0\n",
        "kfold = StratifiedKFold()  # default 5 folds, each preserving the class ratio\n",
        "for name, model in models:\n",
        "  cv_results = cross_val_score(model, x_features_train, y_labels_train, scoring='accuracy', cv=kfold)\n",
        "  mean_result, std_result = np.mean(cv_results), np.std(cv_results)\n",
        "  results.append(cv_results)\n",
        "  names.append(name)\n",
        "  print(name + \": mean(accuracy)=\" + str(round(mean_result, 3)) + \", std(accuracy)=\" + str(round(std_result, 3)))\n",
        "  # Keep the model with the highest mean accuracy; break ties by the lower standard deviation.\n",
        "  if (best_mean_result < mean_result) or \\\n",
        "    ((best_mean_result == mean_result) and (best_std_result > std_result)):\n",
        "    best_mean_result = mean_result\n",
        "    best_std_result = std_result\n",
        "    best_model_name = name\n",
        "    best_model = model\n",
        "print(\"\\nBest model is:\\n\" + best_model_name)"
      ]
    },
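    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "The selection rule used above (highest mean accuracy, with ties broken by the lower standard deviation) can be isolated in a few lines. The cell below is only a sketch on synthetic data from `make_classification`, not on the tweet features; every `demo_*` name is hypothetical and exists only for this illustration."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import numpy as np\n",
        "from sklearn.datasets import make_classification\n",
        "from sklearn.linear_model import LogisticRegression\n",
        "from sklearn.model_selection import StratifiedKFold, cross_val_score\n",
        "from sklearn.tree import DecisionTreeClassifier\n",
        "\n",
        "# Synthetic stand-in data (an assumption for this sketch, not the notebook's dataset).\n",
        "demo_X, demo_y = make_classification(n_samples=200, n_features=20, random_state=0)\n",
        "\n",
        "demo_models = [\n",
        "    ('L1 logistic regression', LogisticRegression(solver='liblinear', penalty='l1')),\n",
        "    ('Decision Tree', DecisionTreeClassifier(random_state=0)),\n",
        "]\n",
        "\n",
        "# One array of per-fold accuracies for each candidate model.\n",
        "demo_folds = StratifiedKFold(n_splits=5)\n",
        "demo_scores = {}\n",
        "for demo_name, demo_model in demo_models:\n",
        "    demo_scores[demo_name] = cross_val_score(demo_model, demo_X, demo_y, scoring='accuracy', cv=demo_folds)\n",
        "\n",
        "# Highest mean accuracy wins; a lower standard deviation breaks ties.\n",
        "demo_best = max(demo_scores, key=lambda n: (np.mean(demo_scores[n]), -np.std(demo_scores[n])))\n",
        "print(demo_best)"
      ]
    },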
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "5HUN7jJE1IcP"
      },
      "source": [
        "Observe the distribution of the results across the validation folds:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 30,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 452
        },
        "id": "sx1tB-Vw1IuI",
        "outputId": "eeec327c-b67c-409b-b4bc-6888a6efc076"
      },
      "outputs": [
        {
          "data": {
            "image/png": "iVBORw0KGgoAAAANSUhEUgAAAiwAAAGzCAYAAAAMr0ziAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjcuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/bCgiHAAAACXBIWXMAAA9hAAAPYQGoP6dpAABUYklEQVR4nO3dd1gU1/4/8PcC0sECChZkjYpgCQgKEYyVBKNiiV0Q7JqIDeNXjCJRrxJTCIlRMV4sN+jVaNQYOyGisWIWS4wUSxCjAhIVpKrs+f2RH3tdKbIrZSDv1/Pso8yeOfOZ3WV575kzszIhhAARERGRhOnUdAFEREREL8PAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCkiOTyfDRRx9pvF5KSgpkMhk2b95c6TVJkVwux/jx42u6DABAbGwsZDIZYmNjVcvGjx8PuVxeLdt/8bHYvHkzZDIZfv3112rZfq9evdCrV69q2Za20tPTMXz4cFhYWEAmkyE8PLymSyLSCAMLlar4DV8mk+HkyZMl7hdCwMbGBjKZDAMHDqyBCitPr169JPOH/1VcvXoVH330EVJSUrRaXwph71X3oSpJubaKmDt3Lo4cOYKFCxfi22+/Rb9+/Wq6JCKN6NV0ASRthoaG2LZtG7p37662/Pjx4/jzzz9hYGBQQ5XRi65evYqlS5eiV69e1TayUZ4NGzZAqVRqtI62+5CUlAQdnar9/FVebUePHq3SbVeGn3/+GYMHD8YHH3xQ06UQaYUjLFSu/v37Y+fOnXj27Jna8m3btsHFxQXW1tY1VFnNUyqVKCgoqOkyJKtevXpVGmiFEMjPzwcAGBgYoF69elW2rZfR19eHvr5+jW2/IjIyMtCgQYOaLqPSFRQUaByMqXZiYKFyjRkzBn/99Reio6NVy548eYJdu3Zh7Nixpa6Tm5uLefPmwcbGBgYGBmjXrh0+++wzvPjF4IWFhZg7dy4aN24MMzMzDBo0CH/++Wepfd65cwcTJ06ElZUVDAwM0KFDB2zcuPGl9aelpWHChAlo0aIFDAwM0LRpUwwePFirYX2ZTIaAgABs3boVHTp0gIGBAQ4fPqxRfatXr0aHDh1gbGyMhg0bokuXLti2bZvq/rLmfXz00UeQyWRl1rZ582aMGDECANC7d2/V4bziOSW//vorvLy8YGlpCSMjI7Rq1QoTJ07U+DEAgD///BNDhgyBiYkJmjRpgrlz56KwsLBEu9L2Zfv27XBxcYGZmRnMzc3RqVMnfPnllxXaB7lcjoEDB+LIkSPo0qULjIyMsH79etV9pR3Wy8vLw7Rp02BhYQFzc3P4+fnh4cOHam3KmjP1fJ8vq620OSwZGRmYNGkSrKysYGhoCEdHR2zZskWtTfG8q88++wzffPMNWrduDQMDA3Tt2hXnz58vUVNpbt68iREjRqBRo0YwNjbGG2+8gQMHDqjuLz68K4TAmjVrVLWX57PPPoO7uzssLCxgZGQEFxcX7Nq1q9S2UVFRcHV1Vb2me/ToUWLE6dChQ+jZs6fqee/atava676s5+/Fx7V4rtT27duxePFiNG/eHMbGxsjOzsaDBw/wwQcfoFOnTjA1NYW5uTneeecdXLp0qUS/BQUF+Oijj2BnZwdDQ0M0bdoU7777Lm7cuAEhBORyOQYPHlzqevXr18e0adPKffyoavCQEJVLLpejW7du+O9//4t33nkHwN9vPllZWRg9ejS++uortfZCCAwaNAjHjh3DpEmT4OTkhCNHjmD+/Pm4c+cOvvjiC1XbyZMnIyoqCmPHjoW7uzt+/vlnDBgwoEQN6enpeOONN1SBoXHjxjh06BAmTZqE7OxszJkzp8z6hw0bht9//x0zZ86EXC5HRkYGoqOjkZqaqtVhk59//hnfffcdAgICYGlpCblcXuH6NmzYgFmzZmH48OGYPX
s2CgoKcPnyZZw7d67M8FdRPXr0wKxZs/DVV1/hww8/hIODAwDAwcEBGRkZePvtt9G4cWMEBQWhQYMGSElJwe7duzXeTn5+Pvr27YvU1FTMmjULzZo1w7fffouff/75petGR0djzJgx6Nu3L1atWgUASEhIwKlTpzB79uxy96FYUlISxowZg2nTpmHKlClo165dudsMCAhAgwYN8NFHHyEpKQnr1q3DrVu3VH/4KqoitT0vPz8fvXr1wvXr1xEQEIBWrVph586dGD9+PB49eoTZs2ertd+2bRseP36MadOmQSaT4ZNPPsG7776LmzdvljtylJ6eDnd3d+Tl5WHWrFmwsLDAli1bMGjQIOzatQtDhw5Fjx498O2332LcuHF466234Ofn99L9/fLLLzFo0CD4+PjgyZMn2L59O0aMGIH9+/er/Y4uXboUH330Edzd3bFs2TLo6+vj3Llz+Pnnn/H2228D+DswTZw4ER06dMDChQvRoEEDXLhwAYcPH9b6db98+XLo6+vjgw8+QGFhIfT19XH16lXs3bsXI0aMQKtWrZCeno7169ejZ8+euHr1Kpo1awYAKCoqwsCBAxETE4PRo0dj9uzZePz4MaKjo3HlyhW0bt0avr6++OSTT/DgwQM0atRItd0ff/wR2dnZ8PX11apuekWCqBSbNm0SAMT58+fF119/LczMzEReXp4QQogRI0aI3r17CyGEsLW1FQMGDFCtt3fvXgFA/Otf/1Lrb/jw4UImk4nr168LIYS4ePGiACDef/99tXZjx44VAERISIhq2aRJk0TTpk1FZmamWtvRo0eL+vXrq+r6448/BACxadMmIYQQDx8+FADEp59++uoPiBACgNDR0RG///672vKK1jd48GDRoUOHcrfh7+8vbG1tSywPCQkRL/662traCn9/f9XPO3fuFADEsWPH1Nrt2bNH9Vy+qvDwcAFAfPfdd6plubm5ok2bNiW2/eK+zJ49W5ibm4tnz56V2X9Z+yDE3/sLQBw+fLjU+55/LIpfvy4uLuLJkyeq5Z988okAIH744QfVshdfb2X1WV5tPXv2FD179lT9XPw4RUVFqZY9efJEdOvWTZiamors7GwhxP9esxYWFuLBgweqtj/88IMAIH788ccS23renDlzBADxyy+/qJY9fvxYtGrVSsjlclFUVKS2nzNmzCi3v2LFr9nna+/YsaPo06ePatm1a9eEjo6OGDp0qNp2hBBCqVQKIYR49OiRMDMzE25ubiI/P7/UNkKUfKyLvfi4Hjt2TAAQr732WokaCwoKStTxxx9/CAMDA7Fs2TLVso0bNwoAIiwsrMT2imtKSkoSAMS6devU7h80aJCQy+VqtVP14SEheqmRI0ciPz8f+/fvx+PHj7F///4yPxkdPHgQurq6mDVrltryefPmQQiBQ4cOqdoBKNHuxdESIQS+//57eHt7QwiBzMxM1c3LywtZWVmIj48vtRYjIyPo6+sjNja2xGEAbfXs2RPt27fXqr4GDRrgzz//rPBQf2Upnrewf/9+PH369JX6OnjwIJo2bYrhw4erlhkbG2Pq1KkVqiM3N1ft8KKmWrVqBS8vrwq3nzp1qtoIxXvvvQc9PT3V66+qHDx4ENbW1hgzZoxqWb169TBr1izk5OTg+PHjau1HjRqFhg0bqn5+8803Afx9uOdl23F1dVWbFG9qaoqpU6ciJSUFV69e1ap+IyMj1f8fPnyIrKwsvPnmm2q/a3v37oVSqcSSJUtKTHguHr2Kjo7G48ePERQUBENDw1LbaMPf31+tRuDveUzFdRQVFeGvv/6Cqakp2rVrp1b3999/D0tLS8ycObNEv8U12dnZwc3NDVu3blXd9+DBAxw6dAg+Pj6vVDtpj4GFXqpx48bw9PTEtm3bsHv3bhQVFan9wXrerVu30KxZM5iZmaktLx46v3XrlupfHR0dtG7dWq3di0P89+/fx6NHj/DNN9+gcePGarcJEyYA+HuuQGkMDAywat
UqHDp0CFZWVujRowc++eQTpKWlaf4g/H+tWrXSur4FCxbA1NQUrq6uaNu2LWbMmIFTp05pXUtF9ezZE8OGDcPSpUthaWmJwYMHY9OmTaXOO3mZW7duoU2bNiXesF92aAYA3n//fdjZ2eGdd95BixYtMHHiRNUcoIp68fF/mbZt26r9bGpqiqZNm1b5qcm3bt1C27ZtS/whf/H3oFjLli3Vfi4OLy8L2rdu3Sr1sS9rOxW1f/9+vPHGGzA0NESjRo3QuHFjrFu3DllZWao2N27cgI6OjlqAf9GNGzcAAB07dtSqjrKU9jpQKpX44osv0LZtWxgYGMDS0hKNGzfG5cuXS9Tdrl076OmVPyPCz88Pp06dUj2GO3fuxNOnTzFu3LhK3ReqOAYWqpCxY8fi0KFDiIiIwDvvvFNtZxsUz/739fVFdHR0qTcPD48y158zZw6Sk5MRGhoKQ0NDBAcHw8HBARcuXNCqnhc/1WlSn4ODA5KSkrB9+3Z0794d33//Pbp3746QkBBVf2V9cisqKtKq3uI+d+3ahTNnziAgIEA1QdjFxQU5OTla96upJk2a4OLFi9i3b59qntM777wDf3//Cvfx4uNflV7lMdeUrq5uqcvFCxPVq8Mvv/yCQYMGwdDQEGvXrsXBgwcRHR2NsWPHVlk9mr7uS3sdrFy5EoGBgejRoweioqJw5MgRREdHo0OHDlqdRTR69GjUq1dPNcoSFRWFLl26VCicU9XgpFuqkKFDh2LatGk4e/YsduzYUWY7W1tb/PTTT3j8+LHaKEtiYqLq/uJ/lUql6tNOsaSkJLX+is8gKioqgqenp1a1t27dGvPmzcO8efNw7do1ODk54fPPP0dUVJRW/b1KfSYmJhg1ahRGjRqFJ0+e4N1338WKFSuwcOFCGBoaomHDhnj06FGJ9SrySfllw9RvvPEG3njjDaxYsQLbtm2Dj48Ptm/fjsmTJ7+072K2tra4cuUKhBBq23vxeSuLvr4+vL294e3tDaVSiffffx/r169HcHBwqSM3r+ratWvo3bu36uecnBzcu3cP/fv3Vy0r7TF/8uQJ7t27p7ZMk9psbW1x+fJlKJVKtVGWF38PXpWtrW2pj/2rbOf777+HoaEhjhw5onZa+qZNm9TatW7dGkqlElevXoWTk1OpfRWPoF65cgVt2rQpc5vlve5fe+21CtW9a9cu9O7dG5GRkWrLHz16BEtLS7Wazp07h6dPn5Y7oblRo0YYMGAAtm7dCh8fH5w6dYpXB65hHGGhCjE1NcW6devw0Ucfwdvbu8x2/fv3R1FREb7++mu15V988QVkMpnqTKPif188y+jFNwRdXV0MGzYM33//Pa5cuVJie/fv3y+zlry8vBLXSWndujXMzMy0OhxSGk3q++uvv9Tu09fXR/v27SGEUM0tad26NbKysnD58mVVu3v37mHPnj0vrcXExAQASrzxP3z4sMQn4+I/MJo+Dv3798fdu3fVTnHNy8vDN99889J1X9x/HR0dvP7662p1lLUP2vrmm2/U5u2sW7cOz549U73+gL8f8xMnTpRY78VP95rU1r9/f6SlpamF+2fPnmH16tUwNTVFz549tdmdUrcTFxeHM2fOqJbl5ubim2++gVwuL/dwTVl0dXUhk8nU9j8lJQV79+5VazdkyBDo6Ohg2bJlJUYwil9vb7/9NszMzBAaGlrid/H512Tr1q1x9uxZPHnyRLVs//79uH37tkZ1v/g637lzJ+7cuaO2bNiwYcjMzCzxHvViTQAwbtw4XL16FfPnz4euri5Gjx5d4Xqo8nGEhSqsIkP33t7e6N27NxYtWoSUlBQ4Ojri6NGj+OGHHzBnzhzVJy4nJyeMGTMGa9euRVZWFtzd3RETE4Pr16+X6PPjjz/GsWPH4ObmhilTpqB9+/Z48OAB4uPj8dNPP+HBgwel1pKcnIy+ffti5MiRaN++PfT09LBnzx6kp6dX6htPRet7++23YW1tDQ8PD1hZWSEhIQ
Fff/01BgwYoBqNGj16NBYsWIChQ4di1qxZyMvLw7p162BnZ1fm5OJiTk5O0NXVxapVq5CVlQUDAwP06dMH27Ztw9q1azF06FC0bt0ajx8/xoYNG2Bubq420lARU6ZMwddffw0/Pz8oFAo0bdoU3377LYyNjV+67uTJk/HgwQP06dMHLVq0wK1bt7B69Wo4OTmp5lyUtQ9NmjTRqM5iT548Ub0GkpKSsHbtWnTv3h2DBg1Sq2v69OkYNmwY3nrrLVy6dAlHjhxR+1SuaW1Tp07F+vXrMX78eCgUCsjlcuzatUv1Kf3FOV7aCgoKUl1yYNasWWjUqBG2bNmCP/74A99//71WV/8dMGAAwsLC0K9fP4wdOxYZGRlYs2YN2rRpoxak27Rpg0WLFmH58uV488038e6778LAwADnz59Hs2bNEBoaCnNzc3zxxReYPHkyunbtirFjx6Jhw4a4dOkS8vLyVNelmTx5Mnbt2oV+/fph5MiRuHHjBqKiokrMcSvPwIEDsWzZMkyYMAHu7u747bffsHXr1hIjNH5+fvjPf/6DwMBAxMXF4c0330Rubi5++uknvP/++2rXXxkwYAAsLCywc+dOvPPOO1q/DqmS1MSpSSR9z5/WXJ4XT2sW4u/TKufOnSuaNWsm6tWrJ9q2bSs+/fTTEqcC5ufni1mzZgkLCwthYmIivL29xe3bt0s9zTQ9PV3MmDFD2NjYiHr16glra2vRt29f8c0336javHhac2ZmppgxY4awt7cXJiYmon79+sLNzU3tlFxNoJzTQitS3/r160WPHj2EhYWFMDAwEK1btxbz588XWVlZan0dPXpUdOzYUejr64t27dqJqKioCp3WLIQQGzZsEK+99prQ1dVVnYIbHx8vxowZI1q2bCkMDAxEkyZNxMCBA8Wvv/6q1eNw69YtMWjQIGFsbCwsLS3F7NmzxeHDh196WvOuXbvE22+/LZo0aSL09fVFy5YtxbRp08S9e/deug/F+/via62sx6L49Xv8+HExdepU0bBhQ2Fqaip8fHzEX3/9pbZuUVGRWLBggbC0tBTGxsbCy8tLXL9+vcKPrxAlT78V4u/XxIQJE4SlpaXQ19cXnTp1Ur02ixW/Zks79b6034PS3LhxQwwfPlw0aNBAGBoaCldXV7F///5S+6voac2RkZGibdu2wsDAQNjb24tNmzaV+hoU4u/ThDt37iwMDAxEw4YNRc+ePUV0dLRam3379gl3d3dhZGQkzM3Nhaurq/jvf/+r1ubzzz8XzZs3FwYGBsLDw0P8+uuvZZ7WvHPnzhJ1FBQUiHnz5ommTZsKIyMj4eHhIc6cOVPqc5OXlycWLVokWrVqpfp9HT58uLhx40aJft9//30BQGzbtq1Cjx1VHZkQNTCri4iIqBaYO3cuIiMjkZaWVqGRRKo6nMNCRERUioKCAkRFRWHYsGEMKxLAOSxERETPycjIwE8//YRdu3bhr7/+KvFVClQzGFiIiIiec/XqVfj4+KBJkyb46quvyjxtm6oX57AQERGR5HEOCxEREUkeAwsRERFJXp2Zw6JUKnH37l2YmZnxmzSJiIhqCSEEHj9+jGbNmpV7scM6E1ju3r0LGxubmi6DiIiItHD79m20aNGizPvrTGApvtT17du3YW5uXsPVEBERUUVkZ2fDxsbmpV9ZUWcCS/FhIHNzcwYWIiKiWuZl0zk46ZaIiIgkj4GFiIiIJI+BhYiIiCRPq8CyZs0ayOVyGBoaws3NDXFxceW2Dw8PR7t27WBkZAQbGxvMnTsXBQUFqvuLiooQHByMVq1awcjICK1bt8by5cvBi/ASERERoMWk2x07diAwMBARERFwc3NDeHg4vLy8kJSUhCZNmpRov23bNgQFBWHjxo1wd3dHcnIyxo8fD5lMhrCwMADAqlWrsG7dOmzZsgUdOnTAr7/+igkTJqB+/fqYNWvWq+8lERER1Woaf5eQm5sbunbtiq+//hrA3xdss7GxwcyZMxEUFF
SifUBAABISEhATE6NaNm/ePJw7dw4nT54EAAwcOBBWVlaIjIxUtRk2bBiMjIwQFRVVobqys7NRv359ZGVl8SwhIiKiWqKif781OiT05MkTKBQKeHp6/q8DHR14enrizJkzpa7j7u4OhUKhOmx08+ZNHDx4EP3791drExMTg+TkZADApUuXcPLkSbzzzjtl1lJYWIjs7Gy1GxEREdVNGh0SyszMRFFREaysrNSWW1lZITExsdR1xo4di8zMTHTv3h1CCDx79gzTp0/Hhx9+qGoTFBSE7Oxs2NvbQ1dXF0VFRVixYgV8fHzKrCU0NBRLly7VpHwiIiKqpar8LKHY2FisXLkSa9euRXx8PHbv3o0DBw5g+fLlqjbfffcdtm7dim3btiE+Ph5btmzBZ599hi1btpTZ78KFC5GVlaW63b59u6p3hYiIiGqIRiMslpaW0NXVRXp6utry9PR0WFtbl7pOcHAwxo0bh8mTJwMAOnXqhNzcXEydOhWLFi2Cjo4O5s+fj6CgIIwePVrV5tatWwgNDYW/v3+p/RoYGMDAwECT8omIiKiW0miERV9fHy4uLmoTaJVKJWJiYtCtW7dS18nLyyvx7Yu6uroAoDptuaw2SqVSk/KIiIiojtL4tObAwED4+/ujS5cucHV1RXh4OHJzczFhwgQAgJ+fH5o3b47Q0FAAgLe3N8LCwtC5c2e4ubnh+vXrCA4Ohre3tyq4eHt7Y8WKFWjZsiU6dOiACxcuICwsDBMnTqzEXSUiIqLaSuPAMmrUKNy/fx9LlixBWloanJyccPjwYdVE3NTUVLXRksWLF0Mmk2Hx4sW4c+cOGjdurAooxVavXo3g4GC8//77yMjIQLNmzTBt2jQsWbKkEnax5uTl5ZU5Gbk0+fn5SElJgVwuh5GRkUbbsre3h7GxsaYlEhER1QoaX4dFqqR4HZb4+Hi4uLhUy7YUCgWcnZ2rZVtERESVpaJ/vzUeYaGKs7e3h0KhqHD7hIQE+Pr6IioqCg4ODhpvi4iIqK5iYKlCxsbGWo16ODg4cLSEiIjoOfy2ZiIiIpI8BhYiIiKSPAYWIiIikjwGFiIiIpI8BhYiIiKSPAYWIiIikjwGFiIiIpI8BhYiIiKSPAYWIiIikjwGFiIiIpI8XppfA9euXcPjx4+rrP+EhAS1f6uKmZkZ2rZtW6XbICIiqkwMLBV07do12NnZVcu2fH19q3wbycnJDC1ERFRrMLBUUPHIijbfpFxR+fn5SElJgVwuh5GRUZVso/gboatypIiIiKiyMbBoqKq/SdnDw6PK+iYiIqqtOOmWiIiIJI+BhYiIiCSPh4ToHyEvLw+JiYkVbv8q84ns7e1hbGysaYlERFQOBhb6R0hMTISLi0u1bEuhUFTpPCcion8iBhb6R7C3t4dCoahw++KzqbQ5K8ze3l7T8oiI6CUYWOgfwdjYWKtRj6o+K4yIiCqGgYVqraq88jCvOkxEJC0MLFQrVdeVh3nVYSIiaWBgoVqpqq88zKsOExFJCwML1WpVOceEVx0mIpIOXjiOiIiIJI+BhYiIiCSPgYWIiIgkj4GFiIiIJI+BhYiIiCSPgYWIiIgkj4GFiIiIJI+BhYiIiCSPgYWIiIgkj4GFiIiIJI+BhYiIiCSPgYWIiIgkj4GFiIiIJI+BhYiIiCRPq8CyZs0ayOVyGBoaws3NDXFxceW2Dw8PR7t27WBkZAQbGxvMnTsXBQUFam3u3LkDX19fWFhYwMjICJ06dcKvv/6qTXlERERUx+hpusKOHTsQGBiIiIgIuLm5ITw8HF5eXkhKSkKTJk1KtN+2bRuCgoKwceNGuLu7Izk5GePHj4dMJkNYWBgA4OHDh/Dw8EDv3r1x6NAhNG7cGNeuXUPDhg1ffQ+JiIio1tM4sISFhWHKlCmYMGECACAiIgIHDhzAxo0bERQUVKL96dOn4eHhgbFjxwIA5HI5xowZg3PnzqnarFq1CjY2Nti0aZNqWatWrcqto7CwEIWFha
qfs7OzNd0VIiIiqiU0OiT05MkTKBQKeHp6/q8DHR14enrizJkzpa7j7u4OhUKhOmx08+ZNHDx4EP3791e12bdvH7p06YIRI0agSZMm6Ny5MzZs2FBuLaGhoahfv77qZmNjo8muEBERUS2iUWDJzMxEUVERrKys1JZbWVkhLS2t1HXGjh2LZcuWoXv37qhXrx5at26NXr164cMPP1S1uXnzJtatW4e2bdviyJEjeO+99zBr1ixs2bKlzFoWLlyIrKws1e327dua7AoRERHVIlV+llBsbCxWrlyJtWvXIj4+Hrt378aBAwewfPlyVRulUglnZ2esXLkSnTt3xtSpUzFlyhRERESU2a+BgQHMzc3VbkRERFQ3aTSHxdLSErq6ukhPT1dbnp6eDmtr61LXCQ4Oxrhx4zB58mQAQKdOnZCbm4upU6di0aJF0NHRQdOmTdG+fXu19RwcHPD9999rUh4RERHVURqNsOjr68PFxQUxMTGqZUqlEjExMejWrVup6+Tl5UFHR30zurq6AAAhBADAw8MDSUlJam2Sk5Nha2urSXlERERUR2l8llBgYCD8/f3RpUsXuLq6Ijw8HLm5uaqzhvz8/NC8eXOEhoYCALy9vREWFobOnTvDzc0N169fR3BwMLy9vVXBZe7cuXB3d8fKlSsxcuRIxMXF4ZtvvsE333xTibtKREREtZXGgWXUqFG4f/8+lixZgrS0NDg5OeHw4cOqibipqalqIyqLFy+GTCbD4sWLcefOHTRu3Bje3t5YsWKFqk3Xrl2xZ88eLFy4EMuWLUOrVq0QHh4OHx+fSthFIiIiqu00DiwAEBAQgICAgFLvi42NVd+Anh5CQkIQEhJSbp8DBw7EwIEDtSmHiIiI6jh+lxARERFJHgMLERERSR4DCxEREUmeVnNYiIio9svLy0NiYqJG6+Tn5yMlJQVyuRxGRkYVXs/e3h7GxsaalkikwsBCRPQPlZiYCBcXl2rZlkKhgLOzc7Vsi+omBhYion8oe3t7KBQKjdZJSEiAr68voqKi4ODgoNG2iF4FAwsR0T+UsbGx1qMeDg4OHDGhasVJt0RERCR5DCxEREQkeQwsREREJHkMLERERCR5DCxEREQkeQwsREREJHkMLERERCR5DCxEREQkeQwsREREJHkMLERERCR5DCxEREQkefwuIaqVZM8K0NlaB0aPkoG7tTN3Gz1KRmdrHcieFdR0KUREksfAQrWSYU4q4qeZAiemASdquhrtOACIn2aKhJxUAO41XQ4RkaQxsFCtVGDaEs7rc7B161Y41NKvrU9ITISPjw8i+7es6VKIiCSPgYVqJaFniAtpSuQ3sAOaOdV0OVrJT1PiQpoSQs+wpkshIpI8BhYiqlZ5eXlITEzUaJ38/HykpKRALpfDyMiowuvZ29vD2NhY0xKJSIIYWIioWiUmJsLFxaVatqVQKODs7Fwt25KKa9eu4fHjx1XWf0JCgtq/VcXMzAxt27at0m1Q7cLAQkTVyt7eHgqFQqN1EhIS4Ovri6ioKDg4OGi0rX+Sa9euwc7Orlq25evrW+XbSE5OZmghFQYWIqpWxsbGWo96ODg4/ONGTDRRPLKiabDThLaH5zRRHFCrcqSIah8GFiKiOqaqg52Hh0eV9U1Ultp5xS0iIiL6R2FgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJ0yqwrFmzBnK5HIaGhnBzc0NcXFy57cPDw9GuXTsYGRnBxsYGc+fORUFBQaltP/74Y8hkMsyZM0eb0oiIiKgO0jiw7NixA4GBgQgJCUF8fDwcHR3h5eWFjIyMUttv27YNQUFBCAkJQUJCAiIjI7Fjxw58+OGHJdqeP38e69evx+uvv675nhAREVGdpafpCmFhYZgyZQomTJgAAIiIiMCBAwewceNGBAUFlWh/+vRpeHh4YOzYsQ
AAuVyOMWPG4Ny5c2rtcnJy4OPjgw0bNuBf//qXNvtCRDXk2rVrePz4cZX1n5CQoPZvVTAzM0Pbtm2rrH8iejUaBZYnT55AoVBg4cKFqmU6Ojrw9PTEmTNnSl3H3d0dUVFRiIuLg6urK27evImDBw9i3Lhxau1mzJiBAQMGwNPTs0KBpbCwEIWFhaqfs7OzNdkVIqok165dg52dXbVsy9fXt0r7T05OZmghkiiNAktmZiaKiopgZWWlttzKygqJiYmlrjN27FhkZmaie/fuEELg2bNnmD59utohoe3btyM+Ph7nz5+vcC2hoaFYunSpJuUTURUoHlmJioqCg4NDlWwjPz8fKSkpkMvlMDIyqvT+ExIS4OvrW6WjRET0ajQ+JKSp2NhYrFy5EmvXroWbmxuuX7+O2bNnY/ny5QgODsbt27cxe/ZsREdHw9DQsML9Lly4EIGBgaqfs7OzYWNjUxW7QEQV4ODgAGdn5yrr38PDo8r6JiLp0yiwWFpaQldXF+np6WrL09PTYW1tXeo6wcHBGDduHCZPngwA6NSpE3JzczF16lQsWrQICoUCGRkZam90RUVFOHHiBL7++msUFhZCV1e3RL8GBgYwMDDQpHwiIiKqpTQ6S0hfXx8uLi6IiYlRLVMqlYiJiUG3bt1KXScvLw86OuqbKQ4gQgj07dsXv/32Gy5evKi6denSBT4+Prh48WKpYYWIiIj+WTQ+JBQYGAh/f3906dIFrq6uCA8PR25uruqsIT8/PzRv3hyhoaEAAG9vb4SFhaFz586qQ0LBwcHw9vaGrq4uzMzM0LFjR7VtmJiYwMLCosRyIiIi+mfSOLCMGjUK9+/fx5IlS5CWlgYnJyccPnxYNRE3NTVVbURl8eLFkMlkWLx4Me7cuYPGjRvD29sbK1asqLy9ICIiojpNq0m3AQEBCAgIKPW+2NhY9Q3o6SEkJAQhISEV7v/FPoiIiOifjd8lRERERJLHwEJERESSV+XXYakrZM8K0NlaB0aPkoG7tTfnGT1KRmdrHcielf7lk0RERFLEwFJBhjmpiJ9mCpyYBpyo6Wq05wAgfpopEnJSAbjXdDlEREQVwsBSQQWmLeG8Pgdbt26Fg719TZejtYTERPj4+CCyf8uaLoWIiKjCGFgqSOgZ4kKaEvkN7IBmTjVdjtby05S4kKaE0Kv41yAQERHVtNo7GYOIiIj+MRhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPJ44TiqlfLy8gAA8fHxVdJ/fn4+UlJSIJfLYWRkVCXbSEhIqJJ+iYjqIgYWqpUSExMBAFOmTKnhSl6dmZlZTZdARCR5DCxUKw0ZMgQAYG9vD2Nj40rvPyEhAb6+voiKioKDg0Ol91/MzMwMbdu2rbL+iYjqCgYWqpUsLS0xefLkKt+Og4MDnJ2dq3w7RERUPk66JSIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJ41lCRPRKZM8K0NlaB0aPkoG7tfMzkNGjZHS21oHsWUFNl0JEZWBgIaJXYpiTivhppsCJacCJmq5GOw4A4qeZIiEnFYB7TZdDRKVgYCGiV1Jg2hLO63OwdetWONjb13Q5WklITISPjw8i+7es6VKIqAwMLET0SoSeIS6kKZHfwA5o5lTT5WglP02JC2lKCD3Dmi6FiMpQOw84ExER0T8KAwsRERFJHgMLERERSR4DCxEREUkeAwsRERFJHgMLERERSR4DCxEREUkeAwsRERFJHgMLERERSR4DCxEREUkeAwsRERFJHgMLERERSZ5WgWXNmjWQy+UwNDSEm5sb4uLiym0fHh6Odu3awcjICDY2Npg7dy4KCgpU94eGhqJr164wMzNDkyZNMGTIECQlJWlTGhEREdVBGgeWHTt2IDAwECEhIYiPj4ejoyO8vLyQkZFRavtt27YhKCgIISEhSEhIQGRkJHbs2IEPP/xQ1eb48eOYMW
MGzp49i+joaDx9+hRvv/02cnNztd8zIiIiqjP0NF0hLCwMU6ZMwYQJEwAAEREROHDgADZu3IigoKAS7U+fPg0PDw+MHTsWACCXyzFmzBicO3dO1ebw4cNq62zevBlNmjSBQqFAjx49NC2RiIiI6hiNRliePHkChUIBT0/P/3WgowNPT0+cOXOm1HXc3d2hUChUh41u3ryJgwcPon///mVuJysrCwDQqFGjMtsUFhYiOztb7UZERER1k0YjLJmZmSgqKoKVlZXacisrKyQmJpa6ztixY5GZmYnu3btDCIFnz55h+vTpaoeEnqdUKjFnzhx4eHigY8eOZdYSGhqKpUuXalI+ERER1VIaHxLSVGxsLFauXIm1a9fCzc0N169fx+zZs7F8+XIEBweXaD9jxgxcuXIFJ0+eLLffhQsXIjAwUPVzdnY2bGxsKr1+qhvy8vLKDNWlSUhIUPtXE/b29jA2NtZ4PSIiKptGgcXS0hK6urpIT09XW56eng5ra+tS1wkODsa4ceMwefJkAECnTp2Qm5uLqVOnYtGiRdDR+d9RqYCAAOzfvx8nTpxAixYtyq3FwMAABgYGmpRP/2CJiYlwcXHReD1fX1+N11EoFHB2dtZ4PSIiKptGgUVfXx8uLi6IiYnBkCFDAPx9CCcmJgYBAQGlrpOXl6cWSgBAV1cXACCEUP07c+ZM7NmzB7GxsWjVqpWm+0FULnt7eygUigq3z8/PR0pKCuRyOYyMjDTeFhERVS6NDwkFBgbC398fXbp0gaurK8LDw5Gbm6s6a8jPzw/NmzdHaGgoAMDb2xthYWHo3Lmz6pBQcHAwvL29VcFlxowZ2LZtG3744QeYmZkhLS0NAFC/fn2N/1gQlcbY2FjjUQ8PD48qqoaIiDSlcWAZNWoU7t+/jyVLliAtLQ1OTk44fPiwaiJuamqq2ojK4sWLIZPJsHjxYty5cweNGzeGt7c3VqxYoWqzbt06AECvXr3UtrVp0yaMHz9ei90iIiKiukSrSbcBAQFlHgKKjY1V34CeHkJCQhASElJmf8WHhoiIiIhKw+8SIiIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJ06vpAoiIiAjIy8tDYmJihdvn5+cjJSUFcrkcRkZGFV7P3t4exsbG2pRYoxhYiIiIJCAxMREuLi5Vvh2FQgFnZ+cq305lY2CpoLy8PABAfHx8lW1D27SsiYSEhCrpl4hqnuxZATpb68DoUTJwt/Ye8Td6lIzO1jqQPSuo6VKqlb29PRQKRYXbJyQkwNfXF1FRUXBwcNBoO7URA0sFFQ/TTZkypYYrqRxmZmY1XQIRVTLDnFTETzMFTkwDTtR0NdpzABA/zRQJOakA3Gu6nGpjbGys1ciHg4NDrRwx0ZRWgWXNmjX49NNPkZaWBkdHR6xevRqurq5ltg8PD8e6deuQmpoKS0tLDB8+HKGhoTA0NNS6z+o2ZMgQAFV77E/btKwpMzMztG3btsr6J6KaUWDaEs7rc7B161Y41NJP0QCQkJgIHx8fRPZvWdOlvJJr167h8ePHVdZ/8Yh5VY+cS+VvhsaBZceOHQgMDERERATc3NwQHh4OLy8vJCUloUmTJiXab9u2DUFBQdi4cSPc3d2RnJyM8ePHQyaTISwsTKs+a4KlpSUmT55cLdv6p6RlIqpcQs8QF9KUyG9gBzRzqulytJafpsSFNCWEnuHLG0vUtW
vXYGdnVy3b8vX1rfJtJCcn13ho0TiwhIWFYcqUKZgwYQIAICIiAgcOHMDGjRsRFBRUov3p06fh4eGBsWPHAgDkcjnGjBmDc+fOad0nERGRlBWPrFTliHl1zXv09fWt0pGiitIosDx58gQKhQILFy5ULdPR0YGnpyfOnDlT6jru7u6IiopCXFwcXF1dcfPmTRw8eBDjxo3Tuk8AKCwsRGFhoern7OxsTXaFiIioylX1iLmHh0eV9S01GgWWzMxMFBUVwcrKSm25lZVVmeeOjx07FpmZmejevTuEEHj27BmmT5+ODz/8UOs+ASA0NBRLly7VpPxqp+k59a9yPLK2nldPRERUEVV+llBsbCxWrlyJtWvXws3NDdevX8fs2bOxfPlyBAcHa93vwoULERgYqPo5OzsbNjY2lVFypdH2nHptjkfW1vPqiYiIKkKjwGJpaQldXV2kp6erLU9PT4e1tXWp6wQHB2PcuHGqCaudOnVCbm4upk6dikWLFmnVJwAYGBjAwMBAk/Krnabn1L/K8cjael49ERFRRWgUWPT19eHi4oKYmBjVab5KpRIxMTEICAgodZ28vDzo6KhfwEhXVxcAIITQqs/aQptz6v9JxyOJiIgqSuNDQoGBgfD390eXLl3g6uqK8PBw5Obmqs7w8fPzQ/PmzREaGgoA8Pb2RlhYGDp37qw6JBQcHAxvb29VcHlZn0RE9HK8IjfVZRoHllGjRuH+/ftYsmQJ0tLS4OTkhMOHD6smzaampqqNqCxevBgymQyLFy/GnTt30LhxY3h7e2PFihUV7pOIiF6OV+SmukyrSbcBAQFlHq6JjY1V34CeHkJCQhASEqJ1n0RE9HK8IjfVZfwuISKiOoJX5Ka6rPZ+nScRERH9Y3CEhYiIqJLJnhWgs7UOjB4lA3dr79iA0aNkdLbWgexZQU2XwsBCRERU2QxzUhE/zRQ4MQ04UdPVaM8BQPw0UyTkpAJwr9FaGFiIiIgqWYFpSzivz8HWrVvhUIsv7JmQmAgfHx9E9m9Z06UwsBAREVU2oWeIC2lK5DewA5o51XQ5WstPU+JCmhJCz7CmS+GkWyIiIpI+BhYiIiKSPAYWIiIikjwGFiIiIpI8BhYiIiKSPJ4lRESvpC58QzC/HZhI+hhYiOiV1KVvCOa3AxNJFwMLEb2SuvINwfx2YKpMdWHkEZDW6CMDCxG9En5DMFFJdWnkEZDG6CMDCxERUSWrKyOPgHRGHxlYiIiIKhlHHisfAwsREZEE5OXlqQ4lVUTx/BJN55lU5ahPVWJgISIikoDExES4uLhovJ6vr69G7RUKRa0ckWFgISIikgB7e3soFIoKt9f2LCF7e3ttyqtxDCxEREQSYGxsrPHIh4eHRxVVIz28ND8RERFJHgMLERERSR4DCxEREUkeAwsRERFJHgMLERERSR4DCxEREUkeAwsRERFJHgMLERERSR4DCxEREUkeAwsRERFJHgMLERERSR4DCxEREUkeAwsRERFJHgMLERERSR4DCxEREUkeAwsRERFJnl5NF0BERDUjLy8PiYmJGq2TkJCg9m9F2dvbw9jYWKN1iJ7HwEJE9A+VmJgIFxcXrdb19fXVqL1CoYCzs7NW2yICGFiIiP6x7O3toVAoNFonPz8fKSkpkMvlMDIy0mhbRK9Cq8CyZs0afPrpp0hLS4OjoyNWr14NV1fXUtv26tULx48fL7G8f//+OHDgAAAgJycHQUFB2Lt3L/766y+0atUKs2bNwvTp07Upj4iIKsDY2FirUQ8PD48qqIaofBpPut2xYwcCAwMREhKC+Ph4ODo6wsvLCxkZGaW23717N+7du6e6XblyBbq6uhgxYoSqTWBgIA4fPoyoqCgkJCRgzpw5CAgIwL59+7TfMyIiIqozNA4sYWFhmDJlCiZMmID27dsjIiICxsbG2LhxY6ntGzVqBGtra9UtOjoaxsbGaoHl9OnT8Pf3R69evSCXyzF16lQ4OjoiLi5O+z0jIiKiOk
OjwPLkyRMoFAp4enr+rwMdHXh6euLMmTMV6iMyMhKjR4+GiYmJapm7uzv27duHO3fuQAiBY8eOITk5GW+//XaZ/RQWFiI7O1vtRkRERHWTRoElMzMTRUVFsLKyUltuZWWFtLS0l64fFxeHK1euYPLkyWrLV69ejfbt26NFixbQ19dHv379sGbNGvTo0aPMvkJDQ1G/fn3VzcbGRpNdISIiolqkWi8cFxkZiU6dOpWYoLt69WqcPXsW+/btg0KhwOeff44ZM2bgp59+KrOvhQsXIisrS3W7fft2VZdPRERENUSjs4QsLS2hq6uL9PR0teXp6emwtrYud93c3Fxs374dy5YtU1uen5+PDz/8EHv27MGAAQMAAK+//jouXryIzz77TO3w0/MMDAxgYGCgSflERERUS2k0wqKvrw8XFxfExMSolimVSsTExKBbt27lrrtz504UFhaWuNjQ06dP8fTpU+joqJeiq6sLpVKpSXlERERUR2l8HZbAwED4+/ujS5cucHV1RXh4OHJzczFhwgQAgJ+fH5o3b47Q0FC19SIjIzFkyBBYWFioLTc3N0fPnj0xf/58GBkZwdbWFsePH8d//vMfhIWFvcKuERERUV2hcWAZNWoU7t+/jyVLliAtLQ1OTk44fPiwaiJuampqidGSpKQknDx5EkePHi21z+3bt2PhwoXw8fHBgwcPYGtrixUrVvDCcURERAQAkAkhRE0XURmys7NRv359ZGVlwdzcvKbLIaJKFB8fDxcXF34fDVEdVNG/39V6lhARERGRNhhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjyGFiIiIhI8hhYiIiISPIYWIiIiEjytAosa9asgVwuh6GhIdzc3BAXF1dm2169ekEmk5W4DRgwQK1dQkICBg0ahPr168PExARdu3ZFamqqNuURERFRHaNxYNmxYwcCAwMREhKC+Ph4ODo6wsvLCxkZGaW23717N+7du6e6XblyBbq6uhgxYoSqzY0bN9C9e3fY29sjNjYWly9fRnBwMAwNDbXfMyIiIqozZEIIockKbm5u6Nq1K77++msAgFKphI2NDWbOnImgoKCXrh8eHo4lS5bg3r17MDExAQCMHj0a9erVw7fffqvFLvwtOzsb9evXR1ZWFszNzbXuh4ikJz4+Hi4uLlAoFHB2dq7pcoioElX077dGIyxPnjyBQqGAp6fn/zrQ0YGnpyfOnDlToT4iIyMxevRoVVhRKpU4cOAA7Ozs4OXlhSZNmsDNzQ179+4tt5/CwkJkZ2er3YiIiKhu0iiwZGZmoqioCFZWVmrLrayskJaW9tL14+LicOXKFUyePFm1LCMjAzk5Ofj444/Rr18/HD16FEOHDsW7776L48ePl9lXaGgo6tevr7rZ2NhositERERUi1TrWUKRkZHo1KkTXF1dVcuUSiUAYPDgwZg7dy6cnJwQFBSEgQMHIiIiosy+Fi5ciKysLNXt9u3bVV4/ERER1QyNAoulpSV0dXWRnp6utjw9PR3W1tblrpubm4vt27dj0qRJJfrU09ND+/bt1ZY7ODiUe5aQgYEBzM3N1W5ERERUN2kUWPT19eHi4oKYmBjVMqVSiZiYGHTr1q3cdXfu3InCwkL4+vqW6LNr165ISkpSW56cnAxbW1tNyiMiIqI6Sk/TFQIDA+Hv748uXbrA1dUV4eHhyM3NxYQJEwAAfn5+aN
68OUJDQ9XWi4yMxJAhQ2BhYVGiz/nz52PUqFHo0aMHevfujcOHD+PHH39EbGysdntFREREdYrGgWXUqFG4f/8+lixZgrS0NDg5OeHw4cOqibipqanQ0VEfuElKSsLJkydx9OjRUvscOnQoIiIiEBoailmzZqFdu3b4/vvv0b17dy12iYiIiOoaja/DIlW8DgtR3cXrsBDVXVVyHRYiIiKimsDAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJKnV9MFENE/S15eHhITEzVaJyEhQe3firK3t4exsbFG6xCRNDGwEFG1SkxMhIuLi1br+vr6atReoVDA2dlZq20RkbQwsBBRtbK3t4dCodBonfz8fKSkpEAul8PIyEijbRFR3SATQoiaLqIyZGdno379+sjKyoK5uXlNl0NEREQVUNG/35x0S0RERJLHwEJERESSx8BCREREkqdVYFmzZg3kcjkMDQ3h5uaGuLi4Mtv26tULMpmsxG3AgAGltp8+fTpkMhnCw8O1KY2IiIjqII0Dy44dOxAYGIiQkBDEx8fD0dERXl5eyMjIKLX97t27ce/ePdXtypUr0NXVxYgRI0q03bNnD86ePYtmzZppvidERERUZ2kcWMLCwjBlyhRMmDAB7du3R0REBIyNjbFx48ZS2zdq1AjW1taqW3R0NIyNjUsEljt37mDmzJnYunUr6tWrp93eEBERUZ2kUWB58uQJFAoFPD09/9eBjg48PT1x5syZCvURGRmJ0aNHw8TERLVMqVRi3LhxmD9/Pjp06FChfgoLC5Gdna12IyIiorpJo8CSmZmJoqIiWFlZqS23srJCWlraS9ePi4vDlStXMHnyZLXlq1atgp6eHmbNmlXhWkJDQ1G/fn3VzcbGpsLrEhERUe1SrWcJRUZGolOnTnB1dVUtUygU+PLLL7F582bIZLIK97Vw4UJkZWWpbrdv366KkomIiEgCNAoslpaW0NXVRXp6utry9PR0WFtbl7tubm4utm/fjkmTJqkt/+WXX5CRkYGWLVtCT08Penp6uHXrFubNmwe5XF5mfwYGBjA3N1e7ERERUd2kUWDR19eHi4sLYmJiVMuUSiViYmLQrVu3ctfduXMnCgsLS3x52bhx43D58mVcvHhRdWvWrBnmz5+PI0eOaFIeERER1VEaf/lhYGAg/P390aVLF7i6uiI8PBy5ubmYMGECAMDPzw/NmzdHaGio2nqRkZEYMmQILCws1JZbWFiUWFavXj1YW1ujXbt2mpZHREREdZDGgWXUqFG4f/8+lixZgrS0NDg5OeHw4cOqibipqanQ0VEfuElKSsLJkydx9OjRyqmaiIiI/lHqzLc1Z2VloUGDBrh9+zbnsxAREdUS2dnZsLGxwaNHj1C/fv0y22k8wiJVjx8/BgCe3kxERFQLPX78uNzAUmdGWJRKJe7evQszMzONTo+WkuKUyVGimsfnQlr4fEgHnwvpqCvPhRACjx8/RrNmzUpMKXlenRlh0dHRQYsWLWq6jErB07Slg8+FtPD5kA4+F9JRF56L8kZWilXrheOIiIiItMHAQkRERJLHwCIhBgYGCAkJgYGBQU2X8o/H50Ja+HxIB58L6finPRd1ZtItERER1V0cYSEiIiLJY2AhIiIiyWNgISIiIsljYCEiIiLJY2CpBDKZDHv37q3pMoiIyiWXyxEeHl7pbYmqQ50ILOPHj4dMJoNMJkO9evXQqlUr/N///R8KCgpqurQq9fx+P3+7fv
16jdY0ZMgQSW73zz//hL6+Pjp27Fjq/cePH0efPn3QqFEjGBsbo23btvD398eTJ09UbTZs2ABHR0eYmpqiQYMG6Ny5M0JDQ9X6efDgAebMmQNbW1vo6+ujWbNmmDhxIlJTU195P+uC0p6rXbt2wdDQEJ9//rnqdf3xxx+rtdm7d6/a127ExsZCJpOhQ4cOKCoqUmvboEEDbN68uap2oVK9+P5lZWWFt956Cxs3boRSqazUbZ0/fx5Tp06t9LbaKOv9q/gml8urbNtSdf/+fbz33nto2bIlDAwMYG1tDS8vLxw/fhyWlpYlfieKLV++HFZWVnj69Ck2b94MmUwGBweHEu127txZqx/bOhFYAKBfv364d+8ebt68iS+++ALr169HSEhITZdV5Yr3+/lbq1attOrr+T/MddHmzZsxcuRIZGdn49y5c2r3Xb16Ff369UOXLl1w4sQJ/Pbbb1i9ejX09fVVfww3btyIOXPmYNasWbh48SJOnTqF//u//0NOTo6qnwcPHuCNN97ATz/9hIiICFy/fh3bt2/H9evX0bVrV9y8ebNa97k2+Pe//w0fHx+sW7cO8+bNAwAYGhpi1apVePjw4UvXv3nzJv7zn/9UdZlVqvj3OCUlBYcOHULv3r0xe/ZsDBw4EM+ePau07TRu3BjGxsaV3lYbX375pdr7FgBs2rRJ9fP58+fV2tf19ycAGDZsGC5cuIAtW7YgOTkZ+/btQ69evZCVlQVfX19s2rSpxDpCCGzevBl+fn6oV68eAMDExAQZGRk4c+aMWtvIyEi0bNmyWvalSog6wN/fXwwePFht2bvvvis6d+6s+jkzM1OMHj1aNGvWTBgZGYmOHTuKbdu2qa3Ts2dPMXPmTDF//nzRsGFDYWVlJUJCQtTaJCcnizfffFMYGBgIBwcHcfToUQFA7NmzR9Xm8uXLonfv3sLQ0FA0atRITJkyRTx+/LhEvStWrBBNmjQR9evXF0uXLhVPnz4VH3zwgWjYsKFo3ry52Lhxo8b7/bzY2FjRtWtXoa+vL6ytrcWCBQvE06dP1fZ3xowZYvbs2cLCwkL06tVLCCHEb7/9Jvr16ydMTExEkyZNhK+vr7h//75qvZ07d4qOHTuq9q9v374iJydHhISECABqt2PHjpW7D5XlZY+FUqkUr732mjh8+LBYsGCBmDJlitr9X3zxhZDL5eVuY/DgwWL8+PHltpk+fbowMTER9+7dU1uel5cnmjdvLvr161f+jvwDPP9crVq1ShgaGordu3er3T9w4EBhb28v5s+fr1q+Z88e8fxb1rFjxwQAMX/+fGFjYyMKCgpU99WvX19s2rSpyvelMpT12o2JiREAxIYNG1TLHj58KCZNmiQsLS2FmZmZ6N27t7h48aLaevv27RNdunQRBgYGwsLCQgwZMkR1n62trfjiiy+EEH//ToSEhAgbGxuhr68vmjZtKmbOnFlqWyGEuHXrlhg0aJAwMTERZmZmYsSIESItLU11f0hIiHB0dBT/+c9/hK2trTA3NxejRo0S2dnZFXocXnwftbW1FcuWLRPjxo0TZmZmwt/fXwghxC+//CK6d+8uDA0NRYsWLcTMmTNFTk6Oar2CggIxb9480axZM2FsbCxcXV2r7X3oVTx8+FAAELGxsaXef/nyZQFA/PLLL2rLi38PEhIShBBCbNq0SdSvX18EBASIyZMnq9rdvn1bGBgYiKCgIGFra1tl+1GV6swIy/OuXLmC06dPQ19fX7WsoKAALi4uOHDgAK5cuYKpU6di3LhxiIuLU1t3y5YtMDExwblz5/DJJ59g2bJliI6OBvD3N0K/++670NfXx7lz5xAREYEFCxaorZ+bmwsvLy80bNgQ58+fx86dO/HTTz8hICBArd3PP/+Mu3fv4sSJEwgLC0NISAgGDhyIhg0b4ty5c5g+fTqmTZuGP//8U6vH4M6dO+jfvz+6du2KS5cuYd26dYiMjMS//vWvEvurr6+PU6dOISIiAo8ePU
KfPn3QuXNn/Prrrzh8+DDS09MxcuRIAMC9e/cwZswYTJw4EQkJCYiNjcW7774LIQQ++OADjBw5Um3Ux93dXav6K9uxY8eQl5cHT09P+Pr6Yvv27cjNzVXdb21tjXv37uHEiRNl9mFtbY2zZ8/i1q1bpd6vVCqxfft2+Pj4wNraWu0+IyMjvP/++zhy5AgePHhQOTtVyy1YsADLly/H/v37MXToULX7dHV1sXLlSqxevfqlvwNz5szBs2fPsHr16qost9r16dMHjo6O2L17t2rZiBEjkJGRgUOHDkGhUMDZ2Rl9+/ZVvaYOHDiAoUOHon///rhw4QJiYmLg6upaav/ff/+9ajT62rVr2Lt3Lzp16lRqW6VSicGDB+PBgwc4fvw4oqOjcfPmTYwaNUqt3Y0bN7B3717s378f+/fvx/Hjx8s8jFERn332GRwdHXHhwgUEBwfjxo0b6NevH4YNG4bLly9jx44dOHnypNr7a0BAAM6cOYPt27fj8uXLGDFiBPr164dr165pXUd1MDU1hampKfbu3YvCwsIS93fq1Aldu3bFxo0b1ZZv2rQJ7u7usLe3V1s+ceJEfPfdd8jLywPw9whzv379YGVlVXU7UdVqOjFVBn9/f6GrqytMTEyEgYGBACB0dHTErl27yl1vwIABYt68eaqfe/bsKbp3767WpmvXrmLBggVCCCGOHDki9PT0xJ07d1T3Hzp0SO2TwTfffCMaNmyolvgPHDggdHR0VJ9G/P39ha2trSgqKlK1adeunXjzzTdVPz979kyYmJiI//73vxXa7+Lb8OHDhRBCfPjhh6Jdu3ZCqVSq2q9Zs0aYmpqqttuzZ0+1USghhFi+fLl4++231Zbdvn1bABBJSUlCoVAIACIlJaXMmsob6agqL9vu2LFjxZw5c1Q/Ozo6qn0Cf/bsmRg/frwAIKytrcWQIUPE6tWrRVZWlqrN3bt3xRtvvCEACDs7O+Hv7y927NihejzT0tIEALVPpc/bvXu3ACDOnTv3Svta2/n7+wt9fX0BQMTExJR6f/Fz+cYbb4iJEycKIcoeYXn48KGIiIgQjRo1Eo8ePRJC1I0RFiGEGDVqlHBwcBBC/D2yYG5urjaSJIQQrVu3FuvXrxdCCNGtWzfh4+NT5raeHzX5/PPPhZ2dnXjy5MlL2x49elTo6uqK1NRU1f2///67ACDi4uKEEH+PsBgbG6uNqMyfP1+4ubmVvfPPQSkjLM+PDgkhxKRJk8TUqVPVlv3yyy9CR0dH5Ofni1u3bgldXV2192ghhOjbt69YuHBhheqoSbt27RINGzYUhoaGwt3dXSxcuFBcunRJdX9ERIQwNTVVjdhnZ2cLY2Nj8e9//1vVpniERQghnJycxJYtW4RSqRStW7cWP/zwg/jiiy84wlLTevfujYsXL+LcuXPw9/fHhAkTMGzYMNX9RUVFWL58OTp16oRGjRrB1NQUR44cKTER8vXXX1f7uWnTpsjIyAAAJCQkwMbGBs2aNVPd361bN7X2CQkJcHR0hImJiWqZh4cHlEolkpKSVMs6dOgAHZ3/PfxWVlZqn250dXVhYWGh2vbL9rv49tVXX6nq6Natm9okRQ8PD+Tk5Kh9YnVxcVHr79KlSzh27Jgq7ZuamqqS+40bN+Do6Ii+ffuiU6dOGDFiBDZs2FCheQY16dGjR9i9ezd8fX1Vy3x9fREZGan6WVdXF5s2bcKff/6JTz75BM2bN8fKlSvRoUMH1fH1pk2b4syZM/jtt98we/ZsPHv2DP7+/ujXr5/a5EjBb7t4qddffx1yuRwhISFqc4BetGrVKmzZsgUJCQnl9jdp0iRYWFhg1apVlV1qjRJCqH6HL126hJycHFhYWKj9fv7xxx+4ceMGAODixYvo27dvhfoeMWIE8vPz8dprr2HKlCnYs2dPmfNlit/7bGxsVMvat2+PBg0aqD03crkcZmZmqp+ff//URpcuXdR+vnTpEjZv3qy2/15eXlAqlfjjjz/w22
+/oaioCHZ2dmptjh8/rnqMpGzYsGG4e/cu9u3bh379+iE2NhbOzs6qCeRjxoxBUVERvvvuOwDAjh07oKOjU2Kkq9jEiROxadMmHD9+HLm5uejfv3917UqVqDOBxcTEBG3atIGjoyM2btyIc+fOqf1B+vTTT/Hll19iwYIFOHbsGC5evAgvL68SE7mKJy0Vk8lklT5Tv6ztaLPt4v0uvjVt2lSjOp4PVgCQk5MDb29vtRB08eJFXLt2DT169ICuri6io6Nx6NAhtG/fHqtXr0a7du3wxx9/aLTd6rRt2zYUFBTAzc0Nenp60NPTw4IFC3Dy5EkkJyertW3evDnGjRuHr7/+Gr///jsKCgoQERGh1qZjx454//33ERUVhejoaERHR+P48eNo3LhxiTfw5yUkJEAmk6FNmzZVtq+1RfPmzREbG4s7d+6gX79+ePz4cantevToAS8vLyxcuLDc/vT09LBixQp8+eWXuHv3blWUXCMSEhJUk+hzcnLQtGnTEr+bSUlJmD9/PoC/Dz1WlI2NDZKSkrB27VrVIcsePXrg6dOnWtdb2e+fpb0/TZs2TW3/L126hGvXrqF169bIycmBrq4uFAqFWpuEhAR8+eWXWtdRnQwNDfHWW28hODgYp0+fxvjx41UnkJibm2P48OGqybebNm3CyJEjYWpqWmpfPj4+OHv2LD766COMGzcOenp61bYfVaHOBJbn6ejo4MMPP8TixYuRn58PADh16hQGDx4MX19fODo64rXXXivxx+plHBwccPv2bdUnbgA4e/ZsiTaXLl1Smx9x6tQp6OjooF27dq+wV5pxcHDAmTNn1D7tnzp1CmZmZmjRokWZ6zk7O+P333+HXC5XC0Jt2rRRvXnIZDJ4eHhg6dKluHDhAvT19bFnzx4AUDurRioiIyMxb968Em9yb775Zonjwc9r2LAhmjZtqvZcvqh9+/YA/p67pKOjg5EjR2Lbtm1IS0tTa5efn4+1a9fCy8sLjRo1qpwdq+VsbW1x/PhxpKWllRtaPv74Y/z4448lznh40YgRI9ChQwcsXbq0Ksqtdj///DN+++031Uixs7Mz0tLSoKenV+J309LSEsDfI1cxMTEV3oaRkRG8vb3x1VdfITY2VjWC+KLi977bt2+rll29ehWPHj1S/Q5UB2dnZ1y9erXE/rdp0wb6+vro3LkzioqKkJGRUeL+F+eV1Rbt27dXew+aNGkSTp48if379+P06dOYNGlSmes2atQIgwYNwvHjxzFx4sTqKLdK1cnAAvz95qWrq4s1a9YAANq2bYvo6GicPn0aCQkJmDZtGtLT0zXq09PTE3Z2dvD398elS5fwyy+/YNGiRWptfHx8YGhoCH9/f1y5cgXHjh3DzJkzMW7cuGqd7PT+++/j9u3bmDlzJhITE/HDDz8gJCQEgYGBaoeiXjRjxgw8ePAAY8aMwfnz53Hjxg0cOXIEEyZMQFFREc6dO4eVK1fi119/RWpqKnbv3o379++rzvmXy+W4fPkykpKSkJmZ+Uqf1jSVlZVV4tPnL7/8gvj4eEyePBkdO3ZUu40ZMwZbtmzBs2fPsH79erz33ns4evQobty4gd9//x0LFizA77//Dm9vbwDAe++9h+XLl+PUqVO4desWzp49Cz8/PzRu3Fh1aHDlypWwtrbGW2+9hUOHDuH27ds4ceIEvLy88PTpU9Xrkf5mY2OD2NhYZGRkwMvLC9nZ2SXadOrUCT4+PqrDneX5+OOPsXHjxnJDphQVFhYiLS0Nd+7cQXx8PFauXInBgwdj4MCB8PPzA/D3+0+3bt0wZMgQHD16FCkpKTh9+jQWLVqEX3/9FQAQEhKC//73vwgJCUFCQgJ+++23Mg+Tbd68GZGRkbhy5Qpu3ryJqKgoGBkZwdbWtkRbT09P1fMQHx+PuLg4+Pn5oWfPniUO21SlBQsW4PTp0wgICFCN/P7www+qSbd2dnbw8fGBn58fdu/ejT/++ANxcXEIDQ3FgQMHqq
1Obfz111/o06cPoqKicPnyZfzxxx/YuXMnPvnkEwwePFjVrkePHmjTpg38/Pxgb2//0hMbNm/ejMzMzBKTcmujOhtY9PT0EBAQgE8++QS5ublYvHgxnJ2d4eXlhV69esHa2lrjC5zp6Ohgz549yM/Ph6urKyZPnowVK1aotTE2NladCdK1a1cMHz4cffv2xddff12Je/dyzZs3x8GDBxEXFwdHR0dMnz4dkyZNwuLFi8tdr1mzZjh16hSKiorw9ttvo1OnTpgzZw4aNGgAHR0dmJub48SJE+jfvz/s7OywePFifP7553jnnXcAAFOmTEG7du3QpUsXNG7cGKdOnaqO3QXw94XEOnfurHbbuHEj2rdvX+ov69ChQ5GRkYGDBw/C1dUVOTk5mD59Ojp06ICePXvi7Nmz2Lt3L3r27Ang7zfts2fPYsSIEbCzs8OwYcNgaGiImJgYWFhYAAAsLCxw9uxZ9O7dG9OmTUPr1q0xcuRItG7dGufPn8drr71WbY9HbdGiRQvExsYiMzOzzNCybNmyCh1a6NOnD/r06VOp1y6pDocPH0bTpk0hl8vRr18/HDt2DF999RV++OEH6OrqAvh7ZPPgwYPo0aMHJkyYADs7O4wePRq3bt1SfRjq1asXdu7ciX379sHJyQl9+vQpcSZksQYNGmDDhg3w8PDA66+/jp9++gk//vij6rX8PJlMhh9++AENGzZEjx494Onpiddeew07duyougelFK+//jqOHz+O5ORkvPnmm+jcuTOWLFmiNq9w06ZN8PPzw7x589CuXTsMGTIE58+fl/z1R0xNTeHm5oYvvvgCPXr0QMeOHREcHIwpU6ao/f2QyWSYOHEiHj58WKFREyMjo1Kf09pIJjhDkIiIiCSuzo6wEBERUd3BwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESSx8BCREREksfAQkRERJLHwEJERESS9/8AovyWu0vog/4AAAAASUVORK5CYII=",
            "text/plain": [
              "<Figure size 640x480 with 1 Axes>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "matplotlib.pyplot.boxplot(results, labels=names)\n",
        "matplotlib.pyplot.title(\"Models' results' distribution of accuracy\")\n",
        "matplotlib.pyplot.show()"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "w6HrRE6U1YBS"
      },
      "source": [
        "### Generate the chosen model"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "oODOQLuOoFc2"
      },
      "source": [
        "Optimize the hyperparameters:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 31,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "eQhKbQgrqVbP",
        "outputId": "b1d10691-6e48-46f3-f490-e399e82663d6"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "The optimal hyperparameter 'C' is: 1.0535263157894736\n"
          ]
        }
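      ],
      "source": [
        "# Tune C (the inverse regularization strength) for an L1-penalized logistic regression.\n",
        "model = lm.LogisticRegression(solver='liblinear', penalty='l1', max_iter=1000, random_state=config_dict['seed'])\n",
        "params = {\"C\": np.linspace(start=0.001, stop=10, num=20)}\n",
        "# Cross-validated grid search over the 20 candidate values, scored by accuracy.\n",
        "grid_search = GridSearchCV(model, params, scoring='accuracy')\n",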
        "grid_search.fit(x_features_train, y_labels_train)\n",
        "print(\"The optimal hyperparameter 'C' is:\", grid_search.best_params_[\"C\"])"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "DjK2e9GaoO34"
      },
      "source": [
        "Fit the optimized model to the training set:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 32,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/",
          "height": 74
        },
        "id": "KC8baVMOoPHk",
        "outputId": "57ecf5b8-d69b-4b3a-a197-2e00082ace8c"
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<style>#sk-container-id-1 {color: black;background-color: white;}#sk-container-id-1 pre{padding: 0;}#sk-container-id-1 div.sk-toggleable {background-color: white;}#sk-container-id-1 label.sk-toggleable__label {cursor: pointer;display: block;width: 100%;margin-bottom: 0;padding: 0.3em;box-sizing: border-box;text-align: center;}#sk-container-id-1 label.sk-toggleable__label-arrow:before {content: \"▸\";float: left;margin-right: 0.25em;color: #696969;}#sk-container-id-1 label.sk-toggleable__label-arrow:hover:before {color: black;}#sk-container-id-1 div.sk-estimator:hover label.sk-toggleable__label-arrow:before {color: black;}#sk-container-id-1 div.sk-toggleable__content {max-height: 0;max-width: 0;overflow: hidden;text-align: left;background-color: #f0f8ff;}#sk-container-id-1 div.sk-toggleable__content pre {margin: 0.2em;color: black;border-radius: 0.25em;background-color: #f0f8ff;}#sk-container-id-1 input.sk-toggleable__control:checked~div.sk-toggleable__content {max-height: 200px;max-width: 100%;overflow: auto;}#sk-container-id-1 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before {content: \"▾\";}#sk-container-id-1 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 input.sk-hidden--visually {border: 0;clip: rect(1px 1px 1px 1px);clip: rect(1px, 1px, 1px, 1px);height: 1px;margin: -1px;overflow: hidden;padding: 0;position: absolute;width: 1px;}#sk-container-id-1 div.sk-estimator {font-family: monospace;background-color: #f0f8ff;border: 1px dotted black;border-radius: 0.25em;box-sizing: border-box;margin-bottom: 0.5em;}#sk-container-id-1 div.sk-estimator:hover {background-color: #d4ebff;}#sk-container-id-1 div.sk-parallel-item::after {content: \"\";width: 100%;border-bottom: 1px solid gray;flex-grow: 1;}#sk-container-id-1 
div.sk-label:hover label.sk-toggleable__label {background-color: #d4ebff;}#sk-container-id-1 div.sk-serial::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: 0;}#sk-container-id-1 div.sk-serial {display: flex;flex-direction: column;align-items: center;background-color: white;padding-right: 0.2em;padding-left: 0.2em;position: relative;}#sk-container-id-1 div.sk-item {position: relative;z-index: 1;}#sk-container-id-1 div.sk-parallel {display: flex;align-items: stretch;justify-content: center;background-color: white;position: relative;}#sk-container-id-1 div.sk-item::before, #sk-container-id-1 div.sk-parallel-item::before {content: \"\";position: absolute;border-left: 1px solid gray;box-sizing: border-box;top: 0;bottom: 0;left: 50%;z-index: -1;}#sk-container-id-1 div.sk-parallel-item {display: flex;flex-direction: column;z-index: 1;position: relative;background-color: white;}#sk-container-id-1 div.sk-parallel-item:first-child::after {align-self: flex-end;width: 50%;}#sk-container-id-1 div.sk-parallel-item:last-child::after {align-self: flex-start;width: 50%;}#sk-container-id-1 div.sk-parallel-item:only-child::after {width: 0;}#sk-container-id-1 div.sk-dashed-wrapped {border: 1px dashed gray;margin: 0 0.4em 0.5em 0.4em;box-sizing: border-box;padding-bottom: 0.4em;background-color: white;}#sk-container-id-1 div.sk-label label {font-family: monospace;font-weight: bold;display: inline-block;line-height: 1.2em;}#sk-container-id-1 div.sk-label-container {text-align: center;}#sk-container-id-1 div.sk-container {/* jupyter's `normalize.less` sets `[hidden] { display: none; }` but bootstrap.min.css set `[hidden] { display: none !important; }` so we also need the `!important` here to be able to override the default hidden behavior on the sphinx rendered scikit-learn.org. 
See: https://github.com/scikit-learn/scikit-learn/issues/21755 */display: inline-block !important;position: relative;}#sk-container-id-1 div.sk-text-repr-fallback {display: none;}</style><div id=\"sk-container-id-1\" class=\"sk-top-container\"><div class=\"sk-text-repr-fallback\"><pre>LogisticRegression(C=1.0535263157894736, max_iter=1000, random_state=0)</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class=\"sk-container\" hidden><div class=\"sk-item\"><div class=\"sk-estimator sk-toggleable\"><input class=\"sk-toggleable__control sk-hidden--visually\" id=\"sk-estimator-id-1\" type=\"checkbox\" checked><label for=\"sk-estimator-id-1\" class=\"sk-toggleable__label sk-toggleable__label-arrow\">LogisticRegression</label><div class=\"sk-toggleable__content\"><pre>LogisticRegression(C=1.0535263157894736, max_iter=1000, random_state=0)</pre></div></div></div></div></div>"
            ],
            "text/plain": [
              "LogisticRegression(C=1.0535263157894736, max_iter=1000, random_state=0)"
            ]
          },
          "execution_count": 32,
          "metadata": {},
          "output_type": "execute_result"
        }
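      ],
      "source": [
        "# Note: C was tuned with solver='liblinear', penalty='l1'; this refit uses the defaults (L2 penalty).\n",
        "# Pass the same solver and penalty here to keep the final model consistent with the search.\n",
        "model = lm.LogisticRegression(C=grid_search.best_params_[\"C\"], max_iter=1000, random_state=config_dict['seed'])\n",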
        "model.fit(x_features_train, y_labels_train)"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 33,
      "metadata": {
        "id": "I237S_fTDEqj"
      },
      "outputs": [],
      "source": [
        "# This cell and the next show the equivalent steps if we had picked the Random Forest model instead:\n",
        "\n",
        "# model = RandomForestClassifier()\n",
        "# params = {\"n_estimators\": range(10, 31, 3),\n",
        "#           \"min_samples_split\": range(2, 10, 2)}\n",
        "# grid_search = GridSearchCV(model, params)\n",
        "# grid_search.fit(x_features_train, y_labels_train)\n",
        "\n",
        "# print(\"The optimal hyperparameter 'n_estimators' is:\", grid_search.best_params_[\"n_estimators\"])\n",
        "# print(\"The optimal hyperparameter 'min_samples_split' is:\", grid_search.best_params_[\"min_samples_split\"])\n"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 34,
      "metadata": {
        "id": "NuInjrRpPIK4"
      },
      "outputs": [],
      "source": [
        "# model = RandomForestClassifier(n_estimators=grid_search.best_params_[\"n_estimators\"],\n",
        "#                                min_samples_split=grid_search.best_params_[\"min_samples_split\"],\n",
        "#                                )\n",
        "# model.fit(x_features_train, y_labels_train)"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "ZinbMBQb8t5D"
      },
      "source": [
        "### Generate the train results: Use for design choices"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 35,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "J1A1nr-kSS0q",
        "outputId": "3032dd18-094e-4cfa-b406-16994faa82b2"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Results on the train set:\n",
            "-------------------------\n",
            "Baseline (dummy classifier) accuracy: 0.79\n",
            "Current model's accuracy: 0.88\n",
            "The accuracy lift is: 11 %\n"
          ]
        }
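      ],
      "source": [
        "# Predict on the training set and compute the fraction of correct predictions.\n",
        "y_train_estimated = model.predict(x_features_train)\n",
        "accuracy_train = np.mean(y_train_estimated == y_labels_train)\n",
        "# Baseline: a dummy classifier that always predicts class 0, the majority class here.\n",
        "baseline_accuracy_train = np.mean(0 == y_labels_train)\n",
        "# Relative improvement over the baseline, in percent.\n",
        "accuracy_lift_train = 100 * (accuracy_train/baseline_accuracy_train - 1)\n",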
        "\n",
        "print(\"Results on the train set:\\n-------------------------\")\n",
        "print(\"Baseline (dummy classifier) accuracy:\", round(baseline_accuracy_train, 2))\n",
        "print(\"Current model's accuracy:\", round(accuracy_train, 2))\n",
        "print(\"The accuracy lift is:\", round(accuracy_lift_train), \"%\")"
      ]
    },
    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {
        "id": "VRSrEo2j8y0P"
      },
      "source": [
        "### Generate the test results: Use for presenting performance"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 37,
      "metadata": {
        "colab": {
          "base_uri": "https://localhost:8080/"
        },
        "id": "msAAgYoBSiK3",
        "outputId": "b810802b-ac9e-4806-ca30-189ecab57848"
      },
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "Results on the test set:\n",
            "-------------------------\n",
            "Baseline (dummy classifier) accuracy: 0.8\n",
            "Current model's accuracy: 0.87\n",
            "The accuracy lift is: 10 %\n",
            "\n",
            "Confusion Matrix:\n",
            "[[3230  128]\n",
            " [ 408  455]]\n",
            "\n",
            "Classification Report:\n",
            "              precision    recall  f1-score   support\n",
            "\n",
            "           0       0.89      0.96      0.92      3358\n",
            "           1       0.78      0.53      0.63       863\n",
            "\n",
            "    accuracy                           0.87      4221\n",
            "   macro avg       0.83      0.74      0.78      4221\n",
            "weighted avg       0.87      0.87      0.86      4221\n",
            "\n"
          ]
        }
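      ],
      "source": [
        "# Evaluate on the held-out test set with the same metrics used for the training set.\n",
        "y_test_estimated = model.predict(x_features_test)\n",
        "accuracy_test = np.mean(y_test_estimated == y_labels_test)\n",
        "# Baseline: always predict class 0, the majority class.\n",
        "baseline_accuracy_test = np.mean(0 == y_labels_test)\n",
        "accuracy_lift = 100 * (accuracy_test/baseline_accuracy_test - 1)\n",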
        "\n",
        "print(\"Results on the test set:\\n-------------------------\")\n",
        "print(\"Baseline (dummy classifier) accuracy:\", round(baseline_accuracy_test, 2))\n",
        "print(\"Current model's accuracy:\", round(accuracy_test, 2))\n",
        "print(\"The accuracy lift is:\", round(accuracy_lift), \"%\")\n",
        "\n",
        "\n",
        "print(\"\\nConfusion Matrix:\")\n",
        "print(confusion_matrix(y_labels_test, y_test_estimated))\n",
        "print(\"\\nClassification Report:\")\n",
        "print(classification_report(y_labels_test, y_test_estimated))"
      ]
    }
  ],
  "metadata": {
    "colab": {
      "provenance": []
    },
    "kernelspec": {
      "display_name": "Python 3",
      "name": "python3"
    },
    "language_info": {
      "name": "python"
    },
    "widgets": {
      "application/vnd.jupyter.widget-state+json": {
        "0b8d4b08978b4effab529c0e3d7edd55": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "14dc9b9eb36946199b4c3af3bcb26969": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_637d14198225420ebe39e1473c4e384b",
            "placeholder": "​",
            "style": "IPY_MODEL_7551693c02134155b91b9ab41663030b",
            "value": "Downloading readme: 100%"
          }
        },
        "1af3968ced8f452aa321dd1f4fe0da65": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "1fc676fcf8364c84bff1e5bfca25ff96": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_238e0af5085a4a93bb4d66e26610b7ee",
              "IPY_MODEL_66d7bd40c2344716b5b57a9cb19d275a",
              "IPY_MODEL_294b50433db146b7b368d5fd1a057568"
            ],
            "layout": "IPY_MODEL_6118939ea4c84b9fb24fed74afc39574"
          }
        },
        "238e0af5085a4a93bb4d66e26610b7ee": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_c2148be1cf704992bc76621abee97324",
            "placeholder": "​",
            "style": "IPY_MODEL_89468e4eecf94e7ebcf5bec407af5edd",
            "value": "Downloading data: 100%"
          }
        },
        "24e9d61d181041ab8a4e49218ab9aefd": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "27cd32faf9304275a4c916e5636c43db": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "294b50433db146b7b368d5fd1a057568": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_48b52c41ef6b4e5ab9196aa8e2d963d4",
            "placeholder": "​",
            "style": "IPY_MODEL_36926767be6e4706ae0c8d25979ca2b3",
            "value": " 2.39M/2.39M [00:00&lt;00:00, 4.48MB/s]"
          }
        },
        "2d3cc6d7b6554f538383fda744484ec8": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": "20px"
          }
        },
        "32758e000b91482e8f3ea6593ff2a38e": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_474a5a77556140c5b961a83331acd113",
              "IPY_MODEL_e276e3c1c48f48709e4f6a0ede6a6f11",
              "IPY_MODEL_52043fd851df47109b6f49792918df1e"
            ],
            "layout": "IPY_MODEL_8f9b145022124e6a93de9c7ee68e7dab"
          }
        },
        "36926767be6e4706ae0c8d25979ca2b3": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "370d37ca6a094e1ab3512ffb03ebabcc": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "3ace061f2c0b4196b329a18f45834a20": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_b381912409de4c268111d685e6ae9706",
              "IPY_MODEL_dfa1cdad61734a4dbf5e3865c8f09495",
              "IPY_MODEL_756364a0d24d4a3e92aa58e27a94d35d"
            ],
            "layout": "IPY_MODEL_4465c14c53d848c1b90abf46052a3160"
          }
        },
        "4465c14c53d848c1b90abf46052a3160": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "4560a66e9ac6449fa3cacd0635c8cd9c": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_f2eb53fae69a4a59b4c525041cbdf750",
              "IPY_MODEL_621c11b74e9045f587cf6718b4a072c2",
              "IPY_MODEL_ce00568869fd447086f301c78653331f"
            ],
            "layout": "IPY_MODEL_bf79ed7ce8d34a93a6fb01ad9f9f70d2"
          }
        },
        "474a5a77556140c5b961a83331acd113": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_6592833b05db42edbbbd75e9e1c23047",
            "placeholder": "​",
            "style": "IPY_MODEL_0b8d4b08978b4effab529c0e3d7edd55",
            "value": "Generating train split: "
          }
        },
        "48b52c41ef6b4e5ab9196aa8e2d963d4": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "491d61801d434be9aa737a6e432ddf73": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "4c9ee2bebe564ffda10ded823cdd6ed4": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "52043fd851df47109b6f49792918df1e": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_491d61801d434be9aa737a6e432ddf73",
            "placeholder": "​",
            "style": "IPY_MODEL_24e9d61d181041ab8a4e49218ab9aefd",
            "value": " 16990/0 [00:00&lt;00:00, 55252.48 examples/s]"
          }
        },
        "52ceda6fa527475090b5e400101615f7": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "5554cebca18943ecb933772e1ab390e7": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "5594c2f22edd4ddc936d5f24e88177d7": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "584d4df165a045b8b757b857239b1355": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "6118939ea4c84b9fb24fed74afc39574": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "621c11b74e9045f587cf6718b4a072c2": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_dc280c0ec8ce41389fc16979c9df55d6",
            "max": 580140,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_370d37ca6a094e1ab3512ffb03ebabcc",
            "value": 580140
          }
        },
        "637d14198225420ebe39e1473c4e384b": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "6592833b05db42edbbbd75e9e1c23047": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "66d7bd40c2344716b5b57a9cb19d275a": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_716d1dd8ad2942d3bc31b0377fc7382e",
            "max": 2385009,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_84d7f755cb8040ac8885318ef9986f7f",
            "value": 2385009
          }
        },
        "716d1dd8ad2942d3bc31b0377fc7382e": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "748de276f13f41a685e2af7072f3c715": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HBoxModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HBoxModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HBoxView",
            "box_style": "",
            "children": [
              "IPY_MODEL_14dc9b9eb36946199b4c3af3bcb26969",
              "IPY_MODEL_9d8d08f40e734ef1aa345d628893c621",
              "IPY_MODEL_fc122e69d02a47ccb5bb751ba273f25c"
            ],
            "layout": "IPY_MODEL_5554cebca18943ecb933772e1ab390e7"
          }
        },
        "7551693c02134155b91b9ab41663030b": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "756364a0d24d4a3e92aa58e27a94d35d": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_b4bce40ddf4d4c3b9e9398142b8bf935",
            "placeholder": "​",
            "style": "IPY_MODEL_cd5f938d3f8342e1b0adcd9b9aa627f5",
            "value": " 4117/0 [00:00&lt;00:00, 46325.70 examples/s]"
          }
        },
        "81fde237bd974ae4bac2f54b7e0e8fd9": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "84d7f755cb8040ac8885318ef9986f7f": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "ProgressStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "ProgressStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "bar_color": null,
            "description_width": ""
          }
        },
        "89468e4eecf94e7ebcf5bec407af5edd": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "8f9b145022124e6a93de9c7ee68e7dab": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "9d8d08f40e734ef1aa345d628893c621": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_beebbed93da24644b35592bc9fce8156",
            "max": 1971,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_81fde237bd974ae4bac2f54b7e0e8fd9",
            "value": 1971
          }
        },
        "aeedd1b5ca9b4d43a85fbd99ec21724e": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": "20px"
          }
        },
        "b381912409de4c268111d685e6ae9706": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_52ceda6fa527475090b5e400101615f7",
            "placeholder": "​",
            "style": "IPY_MODEL_1af3968ced8f452aa321dd1f4fe0da65",
            "value": "Generating validation split: "
          }
        },
        "b4bce40ddf4d4c3b9e9398142b8bf935": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "ba4fef970e684538aef0ddadb22e03d5": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "beebbed93da24644b35592bc9fce8156": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "bf79ed7ce8d34a93a6fb01ad9f9f70d2": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "c2148be1cf704992bc76621abee97324": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "c2acd202411447a19291cca13bc05a23": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "cd5f938d3f8342e1b0adcd9b9aa627f5": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "ce00568869fd447086f301c78653331f": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_27cd32faf9304275a4c916e5636c43db",
            "placeholder": "​",
            "style": "IPY_MODEL_d940b1cf380d43969bcdc9313603b691",
            "value": " 580k/580k [00:01&lt;00:00, 515kB/s]"
          }
        },
        "d940b1cf380d43969bcdc9313603b691": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        },
        "dc280c0ec8ce41389fc16979c9df55d6": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "dfa1cdad61734a4dbf5e3865c8f09495": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_2d3cc6d7b6554f538383fda744484ec8",
            "max": 1,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_4c9ee2bebe564ffda10ded823cdd6ed4",
            "value": 1
          }
        },
        "e276e3c1c48f48709e4f6a0ede6a6f11": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "FloatProgressModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "FloatProgressModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "ProgressView",
            "bar_style": "success",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_aeedd1b5ca9b4d43a85fbd99ec21724e",
            "max": 1,
            "min": 0,
            "orientation": "horizontal",
            "style": "IPY_MODEL_584d4df165a045b8b757b857239b1355",
            "value": 1
          }
        },
        "ebae02b066f04b5c89b14a135603e327": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "f2eb53fae69a4a59b4c525041cbdf750": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ebae02b066f04b5c89b14a135603e327",
            "placeholder": "​",
            "style": "IPY_MODEL_c2acd202411447a19291cca13bc05a23",
            "value": "Downloading data: 100%"
          }
        },
        "fc122e69d02a47ccb5bb751ba273f25c": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_ba4fef970e684538aef0ddadb22e03d5",
            "placeholder": "​",
            "style": "IPY_MODEL_5594c2f22edd4ddc936d5f24e88177d7",
            "value": " 1.97k/1.97k [00:00&lt;00:00, 65.1kB/s]"
          }
        }
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}
